[v2,2/2] remote.c: fix handling of %(push:remoteref)

Message ID	20200303161223.1870298-3-damien.olivier.robert+git@gmail.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <SRS0=wY2r=4U=vger.kernel.org=git-owner@kernel.org> From: Damien Robert <damien.olivier.robert@gmail.com> To: git@vger.kernel.org, Jeff King <peff@peff.net> Cc: Damien Robert <damien.olivier.robert+git@gmail.com> Subject: [PATCH v2 2/2] remote.c: fix handling of %(push:remoteref) Date: Tue, 3 Mar 2020 17:12:23 +0100 Message-Id: <20200303161223.1870298-3-damien.olivier.robert+git@gmail.com> In-Reply-To: <20200303161223.1870298-1-damien.olivier.robert+git@gmail.com> References: <20200302133217.GA1176622@coredump.intra.peff.net> <20200303161223.1870298-1-damien.olivier.robert+git@gmail.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: git-owner@vger.kernel.org Precedence: bulk
Series	\| expand [v2,0/2] [v2,1/2] remote: drop "explicit" parameter from remote_ref_for_branch() [v2,2/2] remote.c: fix handling of %(push:remoteref)

Damien Robert March 3, 2020, 4:12 p.m. UTC

Looking at the value of %(push:remoteref) only handles the case when an
explicit push refspec is passed. But it does not handle the fallback
cases of looking at the configuration value of `push.default`.

In particular, doing something like

    git config push.default current
    git for-each-ref --format='%(push)'
    git for-each-ref --format='%(push:remoteref)'

prints a useful tracking ref for the first for-each-ref, but an empty
string for the second.

Since the intention of %(push:remoteref), from 9700fae5ee (for-each-ref:
let upstream/push report the remote ref name) is to get exactly which
branch `git push` will push to, even in the fallback cases, fix this.

To get the meaning of %(push:remoteref), `ref-filter.c` calls
`remote_ref_for_branch`. We simply add a new static helper function,
`branch_get_push_remoteref` that follows the logic of
`branch_get_push_1`, and call it from `remote_ref_for_branch`.

We also update t/6300-for-each-ref.sh to handle all `push.default`
strategies. This involves testing `push.default=simple` twice, once
where there is a matching upstream branch and once when there is none.

Signed-off-by: Damien Robert <damien.olivier.robert+git@gmail.com>
---
 remote.c                | 106 +++++++++++++++++++++++++++++++---------
 t/t6300-for-each-ref.sh |  29 ++++++++++-
 2 files changed, 112 insertions(+), 23 deletions(-)

Damien Robert March 3, 2020, 4:29 p.m. UTC | #1

I have some remarks/questions:

From Damien Robert, Tue 03 Mar 2020 at 17:12:23 (+0100) :
> +	if (remote->push.nr) {
> +		char *dst;
> +		dst = apply_refspecs(&remote->push, branch->refname);
> +		if (!dst)
> +			return NULL;
> +		return dst;
> +	}

Should I simply `return apply_refspecs(&remote->push, branch->refname);`
here, or is it a good form to always check for a NULL return value even if
we do nothing with it?

> +	case PUSH_DEFAULT_MATCHING:
> +	case PUSH_DEFAULT_CURRENT:
> +		return branch->refname;

Here I follow the logic of branch_get_push1, but the case of
push.default=matching is not quite correct, because we never check
that we have a matching remote branch. On the other hand we cannot check
this until we contact the remote, so I don't know how we could get around
that.

Junio C Hamano March 3, 2020, 6:21 p.m. UTC | #2

Damien Robert <damien.olivier.robert@gmail.com> writes:

> Looking at the value of %(push:remoteref) only handles the case when an
> explicit push refspec is passed. But it does not handle the fallback
> cases of looking at the configuration value of `push.default`.
>
> In particular, doing something like
>
>     git config push.default current
>     git for-each-ref --format='%(push)'
>     git for-each-ref --format='%(push:remoteref)'
>
> prints a useful tracking ref for the first for-each-ref, but an empty
> string for the second.
>
> Since the intention of %(push:remoteref), from 9700fae5ee (for-each-ref:
> let upstream/push report the remote ref name) is to get exactly which
> branch `git push` will push to, even in the fallback cases, fix this.
>
> To get the meaning of %(push:remoteref), `ref-filter.c` calls
> `remote_ref_for_branch`. We simply add a new static helper function,
> `branch_get_push_remoteref` that follows the logic of
> `branch_get_push_1`, and call it from `remote_ref_for_branch`.
>
> We also update t/6300-for-each-ref.sh to handle all `push.default`
> strategies. This involves testing `push.default=simple` twice, once
> where there is a matching upstream branch and once when there is none.
>
> Signed-off-by: Damien Robert <damien.olivier.robert+git@gmail.com>
> ---
>  remote.c                | 106 +++++++++++++++++++++++++++++++---------
>  t/t6300-for-each-ref.sh |  29 ++++++++++-
>  2 files changed, 112 insertions(+), 23 deletions(-)
>
> diff --git a/remote.c b/remote.c
> index c43196ec06..b3ce992d01 100644
> --- a/remote.c
> +++ b/remote.c
> @@ -516,28 +516,6 @@ const char *pushremote_for_branch(struct branch *branch, int *explicit)
>  	return remote_for_branch(branch, explicit);
>  }
>  
> -const char *remote_ref_for_branch(struct branch *branch, int for_push)
> -{
> -	if (branch) {
> -		if (!for_push) {
> -			if (branch->merge_nr) {
> -				return branch->merge_name[0];
> -			}
> -		} else {
> -			const char *dst, *remote_name =
> -				pushremote_for_branch(branch, NULL);
> -			struct remote *remote = remote_get(remote_name);
> -
> -			if (remote && remote->push.nr &&
> -			    (dst = apply_refspecs(&remote->push,
> -						  branch->refname))) {
> -				return dst;
> -			}
> -		}
> -	}
> -	return NULL;
> -}

Mental note: this function was moved down, the main part of the
logic extracted to a new branch_get_push_remoteref() helper, which
in addition got extended.

> @@ -1656,6 +1634,76 @@ static const char *tracking_for_push_dest(struct remote *remote,
>  	return ret;
>  }
>  
> +/**
> + * Return the local name of the remote tracking branch, as in
> + * %(push:remoteref), that corresponds to the ref we would push to given a
> + * bare `git push` while `branch` is checked out.
> + * See also branch_get_push_1 below.
> + */
> +static const char *branch_get_push_remoteref(struct branch *branch)
> +{
> +	struct remote *remote;
> +
> +	remote = remote_get(pushremote_for_branch(branch, NULL));
> +	if (!remote)
> +		return NULL;
> +
> +	if (remote->push.nr) {
> +		char *dst;
> +
> +		dst = apply_refspecs(&remote->push, branch->refname);
> +		if (!dst)
> +			return NULL;
> +
> +		return dst;
> +	}

That's a fairly expensive way to write

	if (remote->push.nr)
		return apply_refspecs(&remote->push, branch->refname);

one-liner.

In any case, up to this point, the code does exactly the same thing
as the original (i.e. when remote.<remotename>.push is set and
covers the current branch, use that to figure out which way we are
pushing).

> +	if (remote->mirror)
> +		return branch->refname;

If mirroring, we push to the same name, OK.

> +	switch (push_default) {
> +	case PUSH_DEFAULT_NOTHING:
> +		return NULL;
> +
> +	case PUSH_DEFAULT_MATCHING:
> +	case PUSH_DEFAULT_CURRENT:
> +		return branch->refname;

These three cases are straight-forward, I think.

> +	case PUSH_DEFAULT_UPSTREAM:
> +		{
> +			if (!branch || !branch->merge ||
> +			    !branch->merge[0] || !branch->merge[0]->dst)
> +			return NULL;
> +
> +			return branch->merge[0]->src;
> +		}

This is strangely indented and somewhat unreadable.  Why isn't this
more like:

	case PUSH_DEFAULT_UPSTREAM:
		if (branch && branch->merge && branch->merge[0] &&
		    branch->merge[0]->dst)
			return branch->merge[0]->src;
		break;

and have "return NULL" after the switch() statement before we leave
the function?

> +
> +	case PUSH_DEFAULT_UNSPECIFIED:
> +	case PUSH_DEFAULT_SIMPLE:
> +		{
> +			const char *up, *cur;
> +
> +			up = branch_get_upstream(branch, NULL);
> +			if (!up)
> +				return NULL;
> +			cur = tracking_for_push_dest(remote, branch->refname, NULL);
> +			if (!cur)
> +				return NULL;
> +			if (strcmp(cur, up))
> +				return NULL;

This is probably not all that performance critical, so

			up = branch_get_upstream(branch, NULL);
			current = tracking_for_push_dest(remote, branch->refname, NULL);
			if (!up || !current || strcmp(current, up))
				return NULL;

might be easier to follow.

> +			return branch->refname;

> +		}
> +	}
> +
> +	BUG("unhandled push situation");

This is better done / easier to read inside switch() as default:
clause.

By the way, I have a bit higher-level question.  

All of the above logic that decides what should happen in "git push"
MUST have existing code we already use to implement "git push", no?

Why do we need to reinvent it here, instead of reusing the existing
code?  Is it because the interface into the functions that implement
the existing logic is very different from what this function wants?

Junio C Hamano March 3, 2020, 6:29 p.m. UTC | #3

Damien Robert <damien.olivier.robert@gmail.com> writes:

>> +	case PUSH_DEFAULT_MATCHING:
>> +	case PUSH_DEFAULT_CURRENT:
>> +		return branch->refname;
>
> Here I follow the logic of branch_get_push1, but the case of
> push.default=matching is not quite correct, because we never check
> that we have a matching remote branch. On the other hand we cannot check
> this until we contact the remote, so I don't know how we could get around
> that.

Quite honestly, I do not think that is a problem that needs to be
solved; there is no workable definition.

Damien Robert March 3, 2020, 10:24 p.m. UTC | #4

From Junio C Hamano, Tue 03 Mar 2020 at 10:21:31 (-0800) :
> Mental note: this function was moved down, the main part of the
> logic extracted to a new branch_get_push_remoteref() helper, which
> in addition got extended.

Exactly.

> That's a fairly expensive way to write
> 	if (remote->push.nr)
> 		return apply_refspecs(&remote->push, branch->refname);
> one-liner.

You anticipated the question I asked afterwards :)

> > +	case PUSH_DEFAULT_UPSTREAM:
> > +		{
> > +			if (!branch || !branch->merge ||
> > +			    !branch->merge[0] || !branch->merge[0]->dst)
> > +			return NULL;
> > +
> > +			return branch->merge[0]->src;
> > +		}
> 
> This is strangely indented and somewhat unreadable.  Why isn't this
> more like:

Sorry I missed the indentation for the return NULL.
 
> 	case PUSH_DEFAULT_UPSTREAM:
> 		if (branch && branch->merge && branch->merge[0] &&
> 		    branch->merge[0]->dst)
> 			return branch->merge[0]->src;
> 		break;
> 
> and have "return NULL" after the switch() statement before we leave
> the function?

We could, I agree this is more readable. If we return NULL after the
switch, we need to put the
> > +	BUG("unhandled push situation")
in a default clause a you say.

The reason I wrote the code as above is to be as close as possible to
`branch_get_push_1`. If we make the changes you suggest, I'll probably need
a preliminary patch to change `branch_get_push_1` accordingly.

> > +	case PUSH_DEFAULT_UNSPECIFIED:
> > +	case PUSH_DEFAULT_SIMPLE:
> > +		{
> > +			const char *up, *cur;
> > +
> > +			up = branch_get_upstream(branch, NULL);
> > +			if (!up)
> > +				return NULL;
> > +			cur = tracking_for_push_dest(remote, branch->refname, NULL);
> > +			if (!cur)
> > +				return NULL;
> > +			if (strcmp(cur, up))
> > +				return NULL;
> 
> This is probably not all that performance critical, so
> 			up = branch_get_upstream(branch, NULL);
> 			current = tracking_for_push_dest(remote, branch->refname, NULL);
> 			if (!up || !current || strcmp(current, up))
> 				return NULL;
> might be easier to follow.

I don't mind but likewise in this case we should probably change
branch_get_push_1 too.

> By the way, I have a bit higher-level question.  
> 
> All of the above logic that decides what should happen in "git push"
> MUST have existing code we already use to implement "git push", no?

Yes.

> Why do we need to reinvent it here, instead of reusing the existing
> code?  Is it because the interface into the functions that implement
> the existing logic is very different from what this function wants?

Mostly yes. The logic of git push is to massage the refspecs directly, for
instance:
	case PUSH_DEFAULT_MATCHING:
		refspec_append(&rs, ":");
	case PUSH_DEFAULT_CURRENT:
		...
		strbuf_addf(&refspec, "%s:%s", branch->refname, branch->refname);
	case PUSH_DEFAULT_UPSTREAM:
		...
		strbuf_addf(&refspec, "%s:%s", branch->refname, branch->merge[0]->src);

And the error messages are also not the same, and to give a good error
message we need to parse the different cases.

It may be possible to refactorize all this, but not in an obvious way and
it would be a lot more work than this patch series.

Junio C Hamano March 3, 2020, 10:48 p.m. UTC | #5

Damien Robert <damien.olivier.robert@gmail.com> writes:

>> By the way, I have a bit higher-level question.  
>> 
>> All of the above logic that decides what should happen in "git push"
>> MUST have existing code we already use to implement "git push", no?
>
> Yes.
>
>> Why do we need to reinvent it here, instead of reusing the existing
>> code?  Is it because the interface into the functions that implement
>> the existing logic is very different from what this function wants?
>
> Mostly yes. The logic of git push is to massage the refspecs directly, for
> instance:
> 	case PUSH_DEFAULT_MATCHING:
> 		refspec_append(&rs, ":");
> 	case PUSH_DEFAULT_CURRENT:
> 		...
> 		strbuf_addf(&refspec, "%s:%s", branch->refname, branch->refname);
> 	case PUSH_DEFAULT_UPSTREAM:
> 		...
> 		strbuf_addf(&refspec, "%s:%s", branch->refname, branch->merge[0]->src);
>
> And the error messages are also not the same, and to give a good error
> message we need to parse the different cases.
>
> It may be possible to refactorize all this, but not in an obvious way and
> it would be a lot more work than this patch series.

Yeah, in light of the analysis I agree it makes sense to take the
approach of these two patches, at least for now.

Thanks.

[v2,2/2] remote.c: fix handling of %(push:remoteref)

Commit Message

Comments

Patch