diff mbox series

[v3,1/7] remote: add reflog check for "force-if-includes"

Message ID 20200913145413.18351-2-shrinidhi.kaushik@gmail.com (mailing list archive)
State Superseded
Headers show
Series push: add "--[no-]force-if-includes" | expand

Commit Message

Srinidhi Kaushik Sept. 13, 2020, 2:54 p.m. UTC
Add a check to verify if the remote-tracking ref of the local branch is
reachable from one of its "reflog" entries; `set_ref_status_for_push`
updated to add a reject reason and disallow the forced push if the
check fails.

When a local branch that is based on a remote ref, has been rewound and
is to be force pushed on the remote, "apply_push_force_if_includes()"
runs a check that ensure any updates to remote-tracking refs that may
have happened (by push from another repository) in-between the time of
the last checkout, and right before the time of push by rejecting the
forced update.

The struct "ref" has three new bit-fields:
  * if_includes: Set when we have to run the new check on the ref.
  * is_tracking: Set when the remote ref was marked as "use_tracking"
                 or "use_tracking_for_rest" by compare-and-swap.
  * unreachable: Set if the ref is unreachable from any of the "reflog"
                 entries of its local counterpart.

When "--force-with-includes" is used along with "--force-with-lease",
the check is run only for refs marked as "is_tracking".

The enum "status" updated to include "REF_STATUS_REJECT_REMOTE_UPDATED"
to imply that the ref failed the check.

Signed-off-by: Srinidhi Kaushik <shrinidhi.kaushik@gmail.com>
---
 remote.c | 135 ++++++++++++++++++++++++++++++++++++++++++++++++++++---
 remote.h |  14 +++++-
 2 files changed, 141 insertions(+), 8 deletions(-)

Comments

Junio C Hamano Sept. 14, 2020, 8:17 p.m. UTC | #1
Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> writes:

> Add a check to verify if the remote-tracking ref of the local branch is
> reachable from one of its "reflog" entries; `set_ref_status_for_push`
> updated to add a reject reason and disallow the forced push if the
> check fails.

I have to wonder (not objecting to, just wondering about) if it is a
good assumption that the current branch must be where we should have
seen the tip of the other side we are about to lose.  I ask because
when I do a large rewrite I often am on a detached HEAD most of the
time, and after everything looks sensible in the rewritten result,
I "checkout -B" the local branch.

We could reduce the rate of false positive ("no you've not looked at
what you are about to discard, so we won't let you force") by
checking reflogs of all the local branches and HEAD, but that may be
too much.  I wonder if checking reflog entries only for HEAD (and
not any of the current local branches) would be a good compromise.
Junio C Hamano Sept. 14, 2020, 8:31 p.m. UTC | #2
Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> writes:

> The struct "ref" has three new bit-fields:
>   * if_includes: Set when we have to run the new check on the ref.
>   * is_tracking: Set when the remote ref was marked as "use_tracking"
>                  or "use_tracking_for_rest" by compare-and-swap.

... meaning that --force-with-lease with an explicit "the tip must
still be at this exact commit" won't use the extra check to loosen
the condition?  That sounds sensible.

>   * unreachable: Set if the ref is unreachable from any of the "reflog"
>                  entries of its local counterpart.

The same comment applies on "reflog of which branch should we
check"; I suspect that checking HEAD and without checking merge base
may prove to be a good way to go.

> + */
> +void apply_push_force_if_includes(struct ref*, int);

SP between ref and '*'?
Junio C Hamano Sept. 14, 2020, 9:13 p.m. UTC | #3
Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> writes:

> +/*
> + * Check for reachability of a remote-tracking
> + * ref in the reflog entries of its local ref.
> + */
> +void check_reflog_for_ref(struct ref *r_ref)

Make it file-scope static.
Srinidhi Kaushik Sept. 16, 2020, 10:51 a.m. UTC | #4
Hello,

On 09/14/2020 13:17, Junio C Hamano wrote:
> Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> writes:
> 
> > Add a check to verify if the remote-tracking ref of the local branch is
> > reachable from one of its "reflog" entries; `set_ref_status_for_push`
> > updated to add a reject reason and disallow the forced push if the
> > check fails.
> 
> I have to wonder (not objecting to, just wondering about) if it is a
> good assumption that the current branch must be where we should have
> seen the tip of the other side we are about to lose.  I ask because
> when I do a large rewrite I often am on a detached HEAD most of the
> time, and after everything looks sensible in the rewritten result,
> I "checkout -B" the local branch.
> 
> We could reduce the rate of false positive ("no you've not looked at
> what you are about to discard, so we won't let you force") by
> checking reflogs of all the local branches and HEAD, but that may be
> too much.  I wonder if checking reflog entries only for HEAD (and
> not any of the current local branches) would be a good compromise.

One scenario I can think of is when there are multiple local branches
that track the same remote. Let's say we have two branches "A" and "B"
and they both track "A" on the remote. If we are currently on "A",
and then we decide to rebase on "origin/A" (after a push from another
repository). Then, if we (accidentally) switch to "B", and force update
with `--force-if-includes` it will _not_ be rejected because HEAD's
reflog has a record of the checkout and there will be an overwrite
if we check only HEAD's reflog.

Thanks.
Johannes Schindelin Sept. 16, 2020, 12:35 p.m. UTC | #5
Hi Srinidhi,

On Sun, 13 Sep 2020, Srinidhi Kaushik wrote:

> diff --git a/remote.c b/remote.c
> index 420150837b..e4b2d85a6f 100644
> --- a/remote.c
> +++ b/remote.c
> @@ -1484,6 +1484,36 @@ void set_ref_status_for_push(struct ref *remote_refs, int send_mirror,
>  				force_ref_update = 1;
>  		}
>
> +		/*
> +		 * If the tip of the remote-tracking ref is unreachable
> +		 * from any reflog entry of its local ref indicating a
> +		 * possible update since checkout; reject the push.
> +		 *
> +		 * There is no need to check for reachability, if the
> +		 * ref is marked for deletion.
> +		 */
> +		if (ref->if_includes && !ref->deletion) {
> +			/*
> +			 * If `force_ref_update' was previously set by
> +			 * "compare-and-swap", and we have to run this
> +			 * check, reset it back to the original value
> +			 * and update it depending on the status of this
> +			 * check.
> +			 */
> +			force_ref_update = ref->force || force_update;
> +
> +			if (ref->unreachable)
> +				reject_reason =
> +					REF_STATUS_REJECT_REMOTE_UPDATED;
> +			else
> +				/*
> +				 * If updates from the remote-tracking ref
> +				 * have been integrated locally; force the
> +				 * update.
> +				 */
> +				force_ref_update = 1;
> +		}
> +
>  		/*
>  		 * If the update isn't already rejected then check
>  		 * the usual "must fast-forward" rules.
> @@ -2272,11 +2302,74 @@ static int remote_tracking(struct remote *remote, const char *refname,
>  	return 0;
>  }
>
> +static int ref_reachable(struct object_id *o_oid, struct object_id *n_oid,
> +			 const char *ident, timestamp_t timestamp, int tz,
> +			 const char *message, void *cb_data)
> +{
> +	int ret = 0;
> +	struct object_id *r_oid = cb_data;
> +
> +	ret = oideq(n_oid, r_oid);
> +	if (!ret) {

Rather than having the largest part of the actual code statements in this
function indented, it would make more sense to write

	if (oideq(n_oid, r_oid))
		return 1;

	if (!(local = lookup...) ||
	    !(remote = lookup...))
		return 0;

	return in_merge_bases(remote, local);

> +		struct commit *loc = lookup_commit_reference(the_repository,
> +							     n_oid);
> +		struct commit *rem = lookup_commit_reference(the_repository,
> +							     r_oid);
> +		ret = (loc && rem) ? in_merge_bases(rem, loc) : 0;
> +	}

This chooses the strategy of iterating over the reflog just once, and at
every step first testing whether the respective reflog entry is identical
to the remote-tracking branch tip. Only when they are different do we test
whether the remote-tracking branch tip is at least reachable from the
reflog entry.

Let's assume that our local branch has 20 reflog entries, and the 4th one
is identical to the current tip of the remote-tracking branch. Then we
tested reachability 3 times. But that test is rather expensive.

Therefore, I would have preferred to have a call to
`for_each_reflog_ent_reverse()` with a callback function that only returns
the `oideq()` result, and only if the return value of that call is 0, I
would have wanted to see another call to `for_each_reflog_ent_reverse()`
to go through, this time looking for reachability.

> +
> +	return ret;
> +}
> +
> +/*
> + * Iterate through the reflog of a local branch and check
> + * if the tip of the remote-tracking branch is reachable
> + * from one of the entries.
> + */
> +static int ref_reachable_from_reflog(const struct object_id *r_oid,
> +				     const struct object_id *l_oid,
> +				     const char *local_ref_name)
> +{
> +	int ret = 0;
> +	struct commit *r_commit, *l_commit;
> +
> +	l_commit = lookup_commit_reference(the_repository, l_oid);
> +	r_commit = lookup_commit_reference(the_repository, r_oid);

At this point, we already LOOked up `r_commit`. But we don't pass that to
`ref_reachable()` at any point (instead passing only `r_oid`), so we have
to perform the lookup again.

That's wasteful. Shouldn't we pass `r_commit` directly?

With the two-pass strategy I outlined above, the first pass would use
`r_oid`, and only when the second pass is necessary would we resort to
calling the expensive reachability check.

> +
> +	/*
> +	 * If the remote-tracking ref is an ancestor of the local
> +	 * ref (a merge, for instance) there is no need to iterate
> +	 * through the reflog entries to ensure reachability; it
> +	 * can be skipped to return early instead.
> +	 */
> +	ret = (r_commit && l_commit) ? in_merge_bases(r_commit, l_commit) : 0;

Correct me if I am wrong, but isn't the first reflog entry
(`<remote-branch>@{0}`) identical to `r_commit`? In that case, the first
iteration of the second pass over the reflog would trivially perform this
check, and we do not need to duplicate the logic here.

> +	if (!ret)
> +		ret = for_each_reflog_ent_reverse(local_ref_name, ref_reachable,
> +						  (struct object_id *)r_oid);
> +
> +	return ret;
> +}
> +
> +/*
> + * Check for reachability of a remote-tracking
> + * ref in the reflog entries of its local ref.
> + */
> +void check_reflog_for_ref(struct ref *r_ref)
> +{
> +	struct object_id r_oid;
> +	struct ref *l_ref = get_local_ref(r_ref->name);
> +
> +	if (r_ref->if_includes && l_ref && !read_ref(l_ref->name, &r_oid))

If `r_ref->if_includes` is 0, we do not even have to get the local ref,
correct? It would make `check_reflog_for_ref()` much easier to read for me
if it was only called when that flag was already verified to be 1, and
then followed this structure:

	if (!l_ref)
		return;
	if (read_ref(...))
		warning(_("ignoring stale remote branch information ..."));
	else
		r_ref->unreachable = ...

Also, it might make a lot more sense to rename `check_reflog_for_ref()` to
`check_if_includes_upstream()`, and to rename `r_ref` to `local` and
`l_ref` to `remote_tracking` or something like that: nothing is inherently
"left" or "right" about those refs.

> +		r_ref->unreachable = !ref_reachable_from_reflog(&r_ref->old_oid,
> +								&r_oid,
> +								l_ref->name);
> +}
> +
>  static void apply_cas(struct push_cas_option *cas,
>  		      struct remote *remote,
>  		      struct ref *ref)
>  {
> -	int i;
> +	int i, is_tracking = 0;
>
>  	/* Find an explicit --<option>=<name>[:<value>] entry */
>  	for (i = 0; i < cas->nr; i++) {
> @@ -2288,16 +2381,26 @@ static void apply_cas(struct push_cas_option *cas,
>  			oidcpy(&ref->old_oid_expect, &entry->expect);
>  		else if (remote_tracking(remote, ref->name, &ref->old_oid_expect))
>  			oidclr(&ref->old_oid_expect);
> -		return;
> +		else
> +			is_tracking = 1;

As part of `remote_tracking()`, we already looked up the branch name.
Since we need it in the `is_tracking` case, maybe this should not be a
Boolean anymore but store a copy of the remote-tracking branch name
instead?

Oh, following the code path all the way down to
`match_name_with_pattern()`, it seems that `remote_tracking()`'s `dst`
variable _already_ contains a copy (which means that that memory is
leaked, right?).

> +		break;
>  	}
>
>  	/* Are we using "--<option>" to cover all? */
> -	if (!cas->use_tracking_for_rest)
> -		return;
> +	if (cas->use_tracking_for_rest) {
> +		ref->expect_old_sha1 = 1;
> +		if (remote_tracking(remote, ref->name, &ref->old_oid_expect))
> +			oidclr(&ref->old_oid_expect);
> +		else
> +			is_tracking = 1;
> +	}
> +
> +	/*
> +	 * Mark this ref to be checked if "--force-if-includes" is
> +	 * specified as an argument along with "compare-and-swap".
> +	 */
> +	ref->is_tracking = is_tracking;
>
> -	ref->expect_old_sha1 = 1;
> -	if (remote_tracking(remote, ref->name, &ref->old_oid_expect))
> -		oidclr(&ref->old_oid_expect);
>  }
>
>  void apply_push_cas(struct push_cas_option *cas,
> @@ -2308,3 +2411,21 @@ void apply_push_cas(struct push_cas_option *cas,
>  	for (ref = remote_refs; ref; ref = ref->next)
>  		apply_cas(cas, remote, ref);
>  }
> +
> +void apply_push_force_if_includes(struct ref *remote_refs, int used_with_cas)
> +{
> +	struct ref *ref;
> +	for (ref = remote_refs; ref; ref = ref->next) {
> +		/*
> +		 * If "compare-and-swap" is used along with option, run the
> +		 * check on refs that have been marked to do so. Otherwise,
> +		 * all refs will be checked.
> +		 */
> +		if (used_with_cas)
> +			ref->if_includes = ref->is_tracking;
> +		else
> +			ref->if_includes = 1;
> +
> +		check_reflog_for_ref(ref);
> +	}
> +}
> diff --git a/remote.h b/remote.h
> index 5e3ea5a26d..1618ba892b 100644
> --- a/remote.h
> +++ b/remote.h
> @@ -104,7 +104,10 @@ struct ref {
>  		forced_update:1,
>  		expect_old_sha1:1,
>  		exact_oid:1,
> -		deletion:1;
> +		deletion:1,
> +		if_includes:1, /* If "--force-with-includes" was specified.  */
> +		is_tracking:1, /* If "use_tracking[_for_rest]" is set (CAS). */
> +		unreachable:1; /* For "if_includes"; unreachable in reflog.  */
>
>  	enum {
>  		REF_NOT_MATCHED = 0, /* initial value */
> @@ -134,6 +137,7 @@ struct ref {
>  		REF_STATUS_REJECT_NEEDS_FORCE,
>  		REF_STATUS_REJECT_STALE,
>  		REF_STATUS_REJECT_SHALLOW,
> +		REF_STATUS_REJECT_REMOTE_UPDATED,
>  		REF_STATUS_UPTODATE,
>  		REF_STATUS_REMOTE_REJECT,
>  		REF_STATUS_EXPECTING_REPORT,
> @@ -346,4 +350,12 @@ int parseopt_push_cas_option(const struct option *, const char *arg, int unset);
>  int is_empty_cas(const struct push_cas_option *);
>  void apply_push_cas(struct push_cas_option *, struct remote *, struct ref *);
>
> +/*
> + * Runs when "--force-if-includes" is specified.
> + * Checks if the remote-tracking ref was updated (since checkout)
> + * implicitly in the background and verify that changes from the
> + * updated tip have been integrated locally, before pushing.
> + */
> +void apply_push_force_if_includes(struct ref*, int);

This function is not even hooked up in this patch, right? I don't think
that it makes sense to introduce it without a caller, in particular since
it makes it harder to guess what those parameters might be used for.

In general, it appears to me as if the code worked way too hard to
accomplish something that should be a lot simpler: when
`--force-if-includes` is passed, it should piggy-back on top of the
`--force-with-lease` code path, and just add yet another check on top.

With that in mind, I would have expected something more in line with this:

-- snip --
 struct push_cas_option {
 	unsigned use_tracking_for_rest:1;
 	struct push_cas {
 		struct object_id expect;
-		unsigned use_tracking:1;
+		enum {
+			PUSH_NO_CAS = 0,
+			PUSH_CAS_USE_TRACKING,
+			PUSH_CAS_IF_INCLUDED
+		} mode;
 		char *refname;
 	} *entry;
 	int nr;
 	int alloc;
 };
-- snap --

and then adjusting the respective code paths accordingly.

Ciao,
Dscho

> +
>  #endif
> --
> 2.28.0
>
>
Srinidhi Kaushik Sept. 19, 2020, 5:01 p.m. UTC | #6
Hi Johannes,

On 09/16/2020 14:35, Johannes Schindelin wrote:

> > [...]
> > +		struct commit *loc = lookup_commit_reference(the_repository,
> > +							     n_oid);
> > +		struct commit *rem = lookup_commit_reference(the_repository,
> > +							     r_oid);
> > +		ret = (loc && rem) ? in_merge_bases(rem, loc) : 0;
> > +	}
> 
> This chooses the strategy of iterating over the reflog just once, and at
> every step first testing whether the respective reflog entry is identical
> to the remote-tracking branch tip. Only when they are different do we test
> whether the remote-tracking branch tip is at least reachable from the
> reflog entry.
> 
> Let's assume that our local branch has 20 reflog entries, and the 4th one
> is identical to the current tip of the remote-tracking branch. Then we
> tested reachability 3 times. But that test is rather expensive.
> 
> Therefore, I would have preferred to have a call to
> `for_each_reflog_ent_reverse()` with a callback function that only returns
> the `oideq()` result, and only if the return value of that call is 0, I
> would have wanted to see another call to `for_each_reflog_ent_reverse()`
> to go through, this time looking for reachability.

OK, you're right about this. Will update check to use the two-step
approach.

> > [...] 
> > +	int ret = 0;
> > +	struct commit *r_commit, *l_commit;
> > +
> > +	l_commit = lookup_commit_reference(the_repository, l_oid);
> > +	r_commit = lookup_commit_reference(the_repository, r_oid);
> 
> At this point, we already LOOked up `r_commit`. But we don't pass that to
> `ref_reachable()` at any point (instead passing only `r_oid`), so we have
> to perform the lookup again.
> 
> That's wasteful. Shouldn't we pass `r_commit` directly?

Hmm, I suppose we can. Not sure why I did that in the first place. 

> With the two-pass strategy I outlined above, the first pass would use
> `r_oid`, and only when the second pass is necessary would we resort to
> calling the expensive reachability check.
> 
> > +
> > +	/*
> > +	 * If the remote-tracking ref is an ancestor of the local
> > +	 * ref (a merge, for instance) there is no need to iterate
> > +	 * through the reflog entries to ensure reachability; it
> > +	 * can be skipped to return early instead.
> > +	 */
> > +	ret = (r_commit && l_commit) ? in_merge_bases(r_commit, l_commit) : 0;
> 
> Correct me if I am wrong, but isn't the first reflog entry
> (`<remote-branch>@{0}`) identical to `r_commit`? In that case, the first
> iteration of the second pass over the reflog would trivially perform this
> check, and we do not need to duplicate the logic here.

That's a correct assumption; the second pass will check for this.

> > +	if (!ret)
> > +		ret = for_each_reflog_ent_reverse(local_ref_name, ref_reachable,
> > +						  (struct object_id *)r_oid);
> > +
> > +	return ret;
> > +}
> > +
> > +/*
> > + * Check for reachability of a remote-tracking
> > + * ref in the reflog entries of its local ref.
> > + */
> > +void check_reflog_for_ref(struct ref *r_ref)
> > +{
> > +	struct object_id r_oid;
> > +	struct ref *l_ref = get_local_ref(r_ref->name);
> > +
> > +	if (r_ref->if_includes && l_ref && !read_ref(l_ref->name, &r_oid))
> 
> If `r_ref->if_includes` is 0, we do not even have to get the local ref,
> correct? It would make `check_reflog_for_ref()` much easier to read for me
> if it was only called when that flag was already verified to be 1, and
> then followed this structure:
> 
> 	if (!l_ref)
> 		return;
> 	if (read_ref(...))
> 		warning(_("ignoring stale remote branch information ..."));
> 	else
> 		r_ref->unreachable = ...
> 
> Also, it might make a lot more sense to rename `check_reflog_for_ref()` to
> `check_if_includes_upstream()`, and to rename `r_ref` to `local` and
> `l_ref` to `remote_tracking` or something like that: nothing is inherently
> "left" or "right" about those refs.

Now that I think about it, we don't need te call to "read_ref()" at all
because the reflog will have the OID we need for checking. As to
"{l,r}_ref", they were meant to be [l]ocal and [r]emote. :)

> > +		r_ref->unreachable = !ref_reachable_from_reflog(&r_ref->old_oid,
> > +								&r_oid,
> > +								l_ref->name);
> > +}
> > +
> >  static void apply_cas(struct push_cas_option *cas,
> >  		      struct remote *remote,
> >  		      struct ref *ref)
> >  {
> > -	int i;
> > +	int i, is_tracking = 0;
> >
> >  	/* Find an explicit --<option>=<name>[:<value>] entry */
> >  	for (i = 0; i < cas->nr; i++) {
> > @@ -2288,16 +2381,26 @@ static void apply_cas(struct push_cas_option *cas,
> >  			oidcpy(&ref->old_oid_expect, &entry->expect);
> >  		else if (remote_tracking(remote, ref->name, &ref->old_oid_expect))
> >  			oidclr(&ref->old_oid_expect);
> > -		return;
> > +		else
> > +			is_tracking = 1;
> 
> As part of `remote_tracking()`, we already looked up the branch name.
> Since we need it in the `is_tracking` case, maybe this should not be a
> Boolean anymore but store a copy of the remote-tracking branch name
> instead?
> 
> Oh, following the code path all the way down to
> `match_name_with_pattern()`, it seems that `remote_tracking()`'s `dst`
> variable _already_ contains a copy (which means that that memory is
> leaked, right?).

I'm not sure I understand what you mean. The call to "remote_tracking()"
gets the OID of the remote ref and writes it to "old_oid_expect".
We don't need "dst" from there because "old_oid" is sufficient for
checking the reflog entries. However, calling "get_local_ref()" is
necessary to get the local branch name. Also, why does "is_tracking"
have to be a Boolean? We already have "ref->name", right?

Anyway, I think we can get rid of "is_tracking" altogether and just use
"if_includes".

> > [...] 
> > +/*
> > + * Runs when "--force-if-includes" is specified.
> > + * Checks if the remote-tracking ref was updated (since checkout)
> > + * implicitly in the background and verify that changes from the
> > + * updated tip have been integrated locally, before pushing.
> > + */
> > +void apply_push_force_if_includes(struct ref*, int);
> 
> This function is not even hooked up in this patch, right? I don't think
> that it makes sense to introduce it without a caller, in particular since
> it makes it harder to guess what those parameters might be used for.
> 
> In general, it appears to me as if the code worked way too hard to
> accomplish something that should be a lot simpler: when
> `--force-if-includes` is passed, it should piggy-back on top of the
> `--force-with-lease` code path, and just add yet another check on top.

Well, it isn't included in this commit, because it was called from
"transport.c" and "send-pack.c", I decided to put that in another
commit. I will fix that in the next patch series.

> With that in mind, I would have expected something more in line with this:
> 
> -- snip --
>  struct push_cas_option {
>  	unsigned use_tracking_for_rest:1;
>  	struct push_cas {
>  		struct object_id expect;
> -		unsigned use_tracking:1;
> +		enum {
> +			PUSH_NO_CAS = 0,
> +			PUSH_CAS_USE_TRACKING,
> +			PUSH_CAS_IF_INCLUDED
> +		} mode;
>  		char *refname;
>  	} *entry;
>  	int nr;
>  	int alloc;
>  };
> -- snap --
> 
> and then adjusting the respective code paths accordingly.

OK, I will clean it up and get rid of the redundant/wasteful
calls and data being passed around. Regarding the new "enum",
I feel that just having a new bit-field "use_force_if_includes"
might be sufficient for "push_cas_option". The idea is to set it
to "1" when "--force-if-includes" is specified along with
"--force-with-lease", and "apply_push_cas()" will run the check
and set "ref->unreachable" depending on the check status.
To clarify, you would have to specify something like this to
enable the check:

   git push --force-if-includes --force-with-lease=master [...]

Then, having a configuration setting "push.useForceIfIncludes" would
be the same as running "git-push" like mentioned above. This new option
will be a "no-op" if specified otherwise. Would that be acceptable?

Thanks again, for a thorough review.
diff mbox series

Patch

diff --git a/remote.c b/remote.c
index 420150837b..e4b2d85a6f 100644
--- a/remote.c
+++ b/remote.c
@@ -1484,6 +1484,36 @@  void set_ref_status_for_push(struct ref *remote_refs, int send_mirror,
 				force_ref_update = 1;
 		}
 
+		/*
+		 * If the tip of the remote-tracking ref is unreachable
+		 * from any reflog entry of its local ref indicating a
+		 * possible update since checkout; reject the push.
+		 *
+		 * There is no need to check for reachability, if the
+		 * ref is marked for deletion.
+		 */
+		if (ref->if_includes && !ref->deletion) {
+			/*
+			 * If `force_ref_update' was previously set by
+			 * "compare-and-swap", and we have to run this
+			 * check, reset it back to the original value
+			 * and update it depending on the status of this
+			 * check.
+			 */
+			force_ref_update = ref->force || force_update;
+
+			if (ref->unreachable)
+				reject_reason =
+					REF_STATUS_REJECT_REMOTE_UPDATED;
+			else
+				/*
+				 * If updates from the remote-tracking ref
+				 * have been integrated locally; force the
+				 * update.
+				 */
+				force_ref_update = 1;
+		}
+
 		/*
 		 * If the update isn't already rejected then check
 		 * the usual "must fast-forward" rules.
@@ -2272,11 +2302,74 @@  static int remote_tracking(struct remote *remote, const char *refname,
 	return 0;
 }
 
+static int ref_reachable(struct object_id *o_oid, struct object_id *n_oid,
+			 const char *ident, timestamp_t timestamp, int tz,
+			 const char *message, void *cb_data)
+{
+	int ret = 0;
+	struct object_id *r_oid = cb_data;
+
+	ret = oideq(n_oid, r_oid);
+	if (!ret) {
+		struct commit *loc = lookup_commit_reference(the_repository,
+							     n_oid);
+		struct commit *rem = lookup_commit_reference(the_repository,
+							     r_oid);
+		ret = (loc && rem) ? in_merge_bases(rem, loc) : 0;
+	}
+
+	return ret;
+}
+
+/*
+ * Iterate through the reflog of a local branch and check
+ * if the tip of the remote-tracking branch is reachable
+ * from one of the entries.
+ */
+static int ref_reachable_from_reflog(const struct object_id *r_oid,
+				     const struct object_id *l_oid,
+				     const char *local_ref_name)
+{
+	int ret = 0;
+	struct commit *r_commit, *l_commit;
+
+	l_commit = lookup_commit_reference(the_repository, l_oid);
+	r_commit = lookup_commit_reference(the_repository, r_oid);
+
+	/*
+	 * If the remote-tracking ref is an ancestor of the local
+	 * ref (a merge, for instance) there is no need to iterate
+	 * through the reflog entries to ensure reachability; it
+	 * can be skipped to return early instead.
+	 */
+	ret = (r_commit && l_commit) ? in_merge_bases(r_commit, l_commit) : 0;
+	if (!ret)
+		ret = for_each_reflog_ent_reverse(local_ref_name, ref_reachable,
+						  (struct object_id *)r_oid);
+
+	return ret;
+}
+
+/*
+ * Check for reachability of a remote-tracking
+ * ref in the reflog entries of its local ref.
+ */
+void check_reflog_for_ref(struct ref *r_ref)
+{
+	struct object_id r_oid;
+	struct ref *l_ref = get_local_ref(r_ref->name);
+
+	if (r_ref->if_includes && l_ref && !read_ref(l_ref->name, &r_oid))
+		r_ref->unreachable = !ref_reachable_from_reflog(&r_ref->old_oid,
+								&r_oid,
+								l_ref->name);
+}
+
 static void apply_cas(struct push_cas_option *cas,
 		      struct remote *remote,
 		      struct ref *ref)
 {
-	int i;
+	int i, is_tracking = 0;
 
 	/* Find an explicit --<option>=<name>[:<value>] entry */
 	for (i = 0; i < cas->nr; i++) {
@@ -2288,16 +2381,26 @@  static void apply_cas(struct push_cas_option *cas,
 			oidcpy(&ref->old_oid_expect, &entry->expect);
 		else if (remote_tracking(remote, ref->name, &ref->old_oid_expect))
 			oidclr(&ref->old_oid_expect);
-		return;
+		else
+			is_tracking = 1;
+		break;
 	}
 
 	/* Are we using "--<option>" to cover all? */
-	if (!cas->use_tracking_for_rest)
-		return;
+	if (cas->use_tracking_for_rest) {
+		ref->expect_old_sha1 = 1;
+		if (remote_tracking(remote, ref->name, &ref->old_oid_expect))
+			oidclr(&ref->old_oid_expect);
+		else
+			is_tracking = 1;
+	}
+
+	/*
+	 * Mark this ref to be checked if "--force-if-includes" is
+	 * specified as an argument along with "compare-and-swap".
+	 */
+	ref->is_tracking = is_tracking;
 
-	ref->expect_old_sha1 = 1;
-	if (remote_tracking(remote, ref->name, &ref->old_oid_expect))
-		oidclr(&ref->old_oid_expect);
 }
 
 void apply_push_cas(struct push_cas_option *cas,
@@ -2308,3 +2411,21 @@  void apply_push_cas(struct push_cas_option *cas,
 	for (ref = remote_refs; ref; ref = ref->next)
 		apply_cas(cas, remote, ref);
 }
+
+void apply_push_force_if_includes(struct ref *remote_refs, int used_with_cas)
+{
+	struct ref *ref;
+	for (ref = remote_refs; ref; ref = ref->next) {
+		/*
+		 * If "compare-and-swap" is used along with option, run the
+		 * check on refs that have been marked to do so. Otherwise,
+		 * all refs will be checked.
+		 */
+		if (used_with_cas)
+			ref->if_includes = ref->is_tracking;
+		else
+			ref->if_includes = 1;
+
+		check_reflog_for_ref(ref);
+	}
+}
diff --git a/remote.h b/remote.h
index 5e3ea5a26d..1618ba892b 100644
--- a/remote.h
+++ b/remote.h
@@ -104,7 +104,10 @@  struct ref {
 		forced_update:1,
 		expect_old_sha1:1,
 		exact_oid:1,
-		deletion:1;
+		deletion:1,
+		if_includes:1, /* If "--force-with-includes" was specified.  */
+		is_tracking:1, /* If "use_tracking[_for_rest]" is set (CAS). */
+		unreachable:1; /* For "if_includes"; unreachable in reflog.  */
 
 	enum {
 		REF_NOT_MATCHED = 0, /* initial value */
@@ -134,6 +137,7 @@  struct ref {
 		REF_STATUS_REJECT_NEEDS_FORCE,
 		REF_STATUS_REJECT_STALE,
 		REF_STATUS_REJECT_SHALLOW,
+		REF_STATUS_REJECT_REMOTE_UPDATED,
 		REF_STATUS_UPTODATE,
 		REF_STATUS_REMOTE_REJECT,
 		REF_STATUS_EXPECTING_REPORT,
@@ -346,4 +350,12 @@  int parseopt_push_cas_option(const struct option *, const char *arg, int unset);
 int is_empty_cas(const struct push_cas_option *);
 void apply_push_cas(struct push_cas_option *, struct remote *, struct ref *);
 
+/*
+ * Runs when "--force-if-includes" is specified.
+ * Checks if the remote-tracking ref was updated (since checkout)
+ * implicitly in the background and verify that changes from the
+ * updated tip have been integrated locally, before pushing.
+ */
+void apply_push_force_if_includes(struct ref*, int);
+
 #endif