Message ID | 20200228172455.1734888-1-damien.olivier.robert+git@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [1/1] remote.c: fix handling of push:remote_ref | expand |
On Fri, Feb 28, 2020 at 06:24:55PM +0100, Damien Robert wrote: > To get the meaning of push:remoteref, ref-filter.c calls > remote_ref_for_branch. > > However remote_ref_for_branch only handles the case of a specified refspec. > The other cases treated by branch_get_push_1 are the mirror case, > PUSH_DEFAULT_{NOTHING,MATCHING,CURRENT,UPSTREAM,UNSPECIFIED,SIMPLE}. Just to back up a minute to the user-visible problem, it's that: git config push.default matching git for-each-ref --format='%(push)' git for-each-ref --format='%(push:remoteref)' prints a useful tracking ref for the first for-each-ref, but an empty string for the second. That feature (and remote_ref_for_branch) come from 9700fae5ee (for-each-ref: let upstream/push report the remote ref name, 2017-11-07). Author cc'd for guidance. I wonder if %(upstream:remoteref) has similar problems, but I suppose not (it doesn't have this implicit config, so we'd always either have a remote ref or we'd have no upstream at all). > In all these cases, either there is no push remote, or the remote_ref is > branch->refname. So we can handle all these cases by returning > branch->refname, provided that remote is not empty. In the case of "upstream", the names could be different, couldn't they? If I do this: git init parent git -C parent commit --allow-empty -m foo git clone parent child cd child git branch --track mybranch origin/master git config push.default upstream git for-each-ref \ --format='push=%(push), remoteref=%(push:remoteref)' \ refs/heads/mybranch the current code gives no remoteref value, which seems wrong. But with your patch I'd get "refs/heads/mybranch", which is also wrong. I think you're right that all of the other cases would always use the same refname on the remote. > remote.c | 5 +++++ > 1 file changed, 5 insertions(+) We'd want some test coverage to make sure this doesn't regress. There are already some tests covering this feature in t6300. And indeed, your patch causes them to fail when checking a "simple" push case (but I think I'd argue the current expected value there is wrong). That should be expanded to cover the "upstream" case, too, once we figure out how to get it right. > diff --git a/remote.c b/remote.c > index 593ce297ed..75e42b1e36 100644 > --- a/remote.c > +++ b/remote.c > @@ -538,6 +538,11 @@ const char *remote_ref_for_branch(struct branch *branch, int for_push, > *explicit = 1; > return dst; > } > + else if (remote) { > + if (explicit) > + *explicit = 1; > + return branch->refname; > + } Saying "*explicit = 1" here seems weird. Isn't the whole point that these modes _aren't_ explicit? It looks like our only caller will ignore our return value unless we say "explicit", though. I have to wonder what the point of that flag is, versus just returning NULL when we don't have anything to return. -Peff
First I apologize for sending the patch too soon, I forgot to put the RFC flag, this is not a complete patch, and I do intend to fix the tests. I was aware of this issue since my patch about push:track, but since I don't use push:remoteref this was a low priority to fix. Last friday I was sick and could not work, so I took the opportunity to scratch this itch. From Jeff King, Fri 28 Feb 2020 at 13:23:49 (-0500) : > Just to back up a minute to the user-visible problem, it's that: > git config push.default matching > git for-each-ref --format='%(push)' > git for-each-ref --format='%(push:remoteref)' > prints a useful tracking ref for the first for-each-ref, but an empty > string for the second. That feature (and remote_ref_for_branch) come > from 9700fae5ee (for-each-ref: let upstream/push report the remote ref > name, 2017-11-07). Author cc'd for guidance. Yes exactly. > I wonder if %(upstream:remoteref) has similar problems, but I suppose > not (it doesn't have this implicit config, so we'd always either have a > remote ref or we'd have no upstream at all). And the code is different. upstream:remoteref uses branch->merge_name[0]. This is due to the fact that the branch struct stores different things for merge branches than for push branches (which hurt my sense of symmetry :)). > > In all these cases, either there is no push remote, or the remote_ref is > > branch->refname. So we can handle all these cases by returning > > branch->refname, provided that remote is not empty. > In the case of "upstream", the names could be different, couldn't they? Yes of course. Moreover there is also the case of 'nothing' where we should not return the branch name (so what we should test for in the other cases is not the existence of `remote` but of `branch->push_remote_ref`.) > We'd want some test coverage to make sure this doesn't regress. There > are already some tests covering this feature in t6300. And indeed, your > patch causes them to fail when checking a "simple" push case (but I > think I'd argue the current expected value there is wrong). That should > be expanded to cover the "upstream" case, too, once we figure out how to > get it right. Yes, I'll do both simple and upstream for tests I think. > > diff --git a/remote.c b/remote.c > > index 593ce297ed..75e42b1e36 100644 > > --- a/remote.c > > +++ b/remote.c > > @@ -538,6 +538,11 @@ const char *remote_ref_for_branch(struct branch *branch, int for_push, > > *explicit = 1; > > return dst; > > } > > + else if (remote) { > > + if (explicit) > > + *explicit = 1; > > + return branch->refname; > > + } > > Saying "*explicit = 1" here seems weird. Isn't the whole point that > these modes _aren't_ explicit? Well pushremote_for_branch also set explicit=1 if only remote.pushDefault is set, so I followed suit. > It looks like our only caller will ignore our return value unless we say > "explicit", though. I have to wonder what the point of that flag is, > versus just returning NULL when we don't have anything to return. I think you looked at the RR_REMOTE_NAME (ref-filter.c:1455), here the situation is handled by RR_REMOTE_REF, where explicit is not used at all. So we could remove it. So it remains the problem of handling the 'upstream' case. The ideal solution would be to not duplicate branch_get_push_1. In most of the case, this function finds `dst` which is exactly the push:remoteref we are looking for. Then branch_get_push_1 uses ret = tracking_for_push_dest(remote, dst, err); which simply calls ret = apply_refspecs(&remote->fetch, dst); The only different case is case PUSH_DEFAULT_UPSTREAM: return branch_get_upstream(branch, err); which returns branch->merge[0]->dst So we could almost write an auxiliary function that returns push:remoteref and use it both in remote_ref_for_branch and branch_get_push_1 (via a further call to tracking_for_push_dest) except for the 'upstream' case which is subtly different. In the 'upstream' case, the auxiliary function would return branch->merge_name[0]. So the question is: can tracking_for_push_dest(branch->merge_name[0]) be different from branch->merge[0]->dst? Now branch->merge is set in `set_merge`, where it is constructed via if (dwim_ref(ret->merge_name[i], strlen(ret->merge_name[i]), &oid, &ref) == 1) And I don't understand dwim_ref enough to know if there could be differences in our setting from tracking_for_push_dest in corner cases. Another solution could be as follow: we already store `push` in `branch->push_tracking_ref`. So the question is: can we always easily convert something like refs/remotes/origin/branch_name to refs/heads/branch_name (ie essentially reverse ètracking_for_push_dest`), or are there corner cases? Otherwise a simple but not elegant solution would be to copy paste the code of branch_get_push_1 to remote_ref_for_branch, simply removing the calls to `tracking_for_push_dest` and using remote->branch_name[0] rather than remote->branch[0]->dst for the upstream case.
[dropping J from cc, since my earlier email bounced] On Sun, Mar 01, 2020 at 11:05:31PM +0100, Damien Robert wrote: > > Saying "*explicit = 1" here seems weird. Isn't the whole point that > > these modes _aren't_ explicit? > > Well pushremote_for_branch also set explicit=1 if only remote.pushDefault > is set, so I followed suit. Yeah, I think the useless "explicit" was a mistake back when the function was added. See the patch below. > > It looks like our only caller will ignore our return value unless we say > > "explicit", though. I have to wonder what the point of that flag is, > > versus just returning NULL when we don't have anything to return. > > I think you looked at the RR_REMOTE_NAME (ref-filter.c:1455), here the > situation is handled by RR_REMOTE_REF, where explicit is not used at all. > So we could remove it. We do look at it, but it's pointless to do so: $ git grep -hn -C4 remote_ref_for_branch origin:ref-filter.c 1461- } else if (atom->u.remote_ref.option == RR_REMOTE_REF) { 1462- int explicit; 1463- const char *merge; 1464- 1465: merge = remote_ref_for_branch(branch, atom->u.remote_ref.push, 1466- &explicit); 1467- *s = xstrdup(explicit ? merge : ""); 1468- } else 1469- BUG("unhandled RR_* enum"); I think we probably ought to do this as a preparatory patch in your series. -- >8 -- Subject: remote: drop "explicit" parameter from remote_ref_for_branch() Commit 9700fae5ee (for-each-ref: let upstream/push report the remote ref name, 2017-11-07) added a remote_ref_for_branch() helper, which is modeled after remote_for_branch(). This includes providing an "explicit" out-parameter that tells the caller whether the remote was configured by the user, or whether we picked a default name like "origin". But unlike remote names, there's no default case for the remote branch name. In any case where we don't set "explicit", we'd just an empty string anyway. Let's instead return NULL in this case, letting us simplify the function interface. Signed-off-by: Jeff King <peff@peff.net> --- ref-filter.c | 6 ++---- remote.c | 11 ++--------- remote.h | 3 +-- 3 files changed, 5 insertions(+), 15 deletions(-) diff --git a/ref-filter.c b/ref-filter.c index 6867e33648..9837700732 100644 --- a/ref-filter.c +++ b/ref-filter.c @@ -1459,12 +1459,10 @@ static void fill_remote_ref_details(struct used_atom *atom, const char *refname, remote_for_branch(branch, &explicit); *s = xstrdup(explicit ? remote : ""); } else if (atom->u.remote_ref.option == RR_REMOTE_REF) { - int explicit; const char *merge; - merge = remote_ref_for_branch(branch, atom->u.remote_ref.push, - &explicit); - *s = xstrdup(explicit ? merge : ""); + merge = remote_ref_for_branch(branch, atom->u.remote_ref.push); + *s = xstrdup(merge ? merge : ""); } else BUG("unhandled RR_* enum"); } diff --git a/remote.c b/remote.c index 593ce297ed..c43196ec06 100644 --- a/remote.c +++ b/remote.c @@ -516,14 +516,11 @@ const char *pushremote_for_branch(struct branch *branch, int *explicit) return remote_for_branch(branch, explicit); } -const char *remote_ref_for_branch(struct branch *branch, int for_push, - int *explicit) +const char *remote_ref_for_branch(struct branch *branch, int for_push) { if (branch) { if (!for_push) { if (branch->merge_nr) { - if (explicit) - *explicit = 1; return branch->merge_name[0]; } } else { @@ -534,15 +531,11 @@ const char *remote_ref_for_branch(struct branch *branch, int for_push, if (remote && remote->push.nr && (dst = apply_refspecs(&remote->push, branch->refname))) { - if (explicit) - *explicit = 1; return dst; } } } - if (explicit) - *explicit = 0; - return ""; + return NULL; } static struct remote *remote_get_1(const char *name, diff --git a/remote.h b/remote.h index b134cc21be..11d8719b58 100644 --- a/remote.h +++ b/remote.h @@ -261,8 +261,7 @@ struct branch { struct branch *branch_get(const char *name); const char *remote_for_branch(struct branch *branch, int *explicit); const char *pushremote_for_branch(struct branch *branch, int *explicit); -const char *remote_ref_for_branch(struct branch *branch, int for_push, - int *explicit); +const char *remote_ref_for_branch(struct branch *branch, int for_push); /* returns true if the given branch has merge configuration given. */ int branch_has_merge_config(struct branch *branch);
On Sun, Mar 01, 2020 at 11:05:31PM +0100, Damien Robert wrote: > So it remains the problem of handling the 'upstream' case. > The ideal solution would be to not duplicate branch_get_push_1. Yeah, that would be nice (though at least if it's all contained in remote.c, we can live with some duplication). There's already some duplication in the way remote_ref_for_branch() applies remote refspecs. And I think all of this may be duplicated with git-push itself (which would also be nice to get rid of, but last time I looked into it was hard to refactor it to do so). > In most of the case, this function finds `dst` which is exactly the > push:remoteref we are looking for. > > Then branch_get_push_1 uses > ret = tracking_for_push_dest(remote, dst, err); > which simply calls > ret = apply_refspecs(&remote->fetch, dst); Right, there we already have the remote name, and are applying the fetch refspecs to know what our tracking branch would be. So in remote_ref_for_branch(), we'd just not apply those. > The only different case is > case PUSH_DEFAULT_UPSTREAM: > return branch_get_upstream(branch, err); > which returns > branch->merge[0]->dst We also have PUSH_DEFAULT_NOTHING, for which obviously we'd return nothing (NULL or an empty string). Likewise for SIMPLE, we probably need to check that the upstream has a matching name (and return nothing if not). > So we could almost write an auxiliary function that returns push:remoteref > and use it both in remote_ref_for_branch and branch_get_push_1 (via a > further call to tracking_for_push_dest) except for the 'upstream' case > which is subtly different. Yes, that makes sense. > In the 'upstream' case, the auxiliary function would return > branch->merge_name[0]. So the question is: can > tracking_for_push_dest(branch->merge_name[0]) be different from > branch->merge[0]->dst? Those will both return tracking refs. I think you just want merge[0]->src for the upstream case. And yes, the two can be different. It's the same case as when the upstream branch has a different name than the current branch. > Another solution could be as follow: we already store `push` in > `branch->push_tracking_ref`. So the question is: can we always easily convert > something like refs/remotes/origin/branch_name to refs/heads/branch_name > (ie essentially reverse ètracking_for_push_dest`), or are there corner cases? This would basically be reverse-applying the fetch refspec. In theory it should be possible, but there are cases where somebody has overlapping refspecs. But at any rate, I think it's better to just get the pre-mapped values (i.e., avoid calling tracking_for_push_dest() in the first place). > Otherwise a simple but not elegant solution would be to copy paste the > code of branch_get_push_1 to remote_ref_for_branch, simply removing the > calls to `tracking_for_push_dest` and using remote->branch_name[0] rather > than remote->branch[0]->dst for the upstream case. Yeah, I think that's going to be the easiest. It would be nice to avoid repeating that switch(), but frankly I think the boilerplate you'll end up with trying to handle the two cases may be worse than just repeating it. It may be worth adding a comment to each function to mention the other, and that any changes need to match. -Peff
From Jeff King, Mon 02 Mar 2020 at 08:32:17 (-0500) : > > I think you looked at the RR_REMOTE_NAME (ref-filter.c:1455), here the > > situation is handled by RR_REMOTE_REF, where explicit is not used at all. > > So we could remove it. > > We do look at it, but it's pointless to do so: Oh sorry, I don't know how I missed this line while I saw it above. > > $ git grep -hn -C4 remote_ref_for_branch origin:ref-filter.c > 1461- } else if (atom->u.remote_ref.option == RR_REMOTE_REF) { > 1462- int explicit; > 1463- const char *merge; > 1464- > 1465: merge = remote_ref_for_branch(branch, atom->u.remote_ref.push, > 1466- &explicit); > 1467- *s = xstrdup(explicit ? merge : ""); > 1468- } else > 1469- BUG("unhandled RR_* enum"); > > I think we probably ought to do this as a preparatory patch in your > series. I wonder about the case of RR_REMOTE_NAME to. We always have explicit=1, except if we fallback all the way to 'origin', via pushremote_for_branch and then remote_for_branch. But 'origin' even through it is implicit, is still the name of the remote we fetch/push to by default. So should not %(push), %(upstream) still show origin in this case?
From Jeff King, Mon 02 Mar 2020 at 08:48:42 (-0500) : > And I think all of this may be duplicated with git-push itself (which > would also be nice to get rid of, but last time I looked into it was > hard to refactor it to do so). I had a quick look at git-push but the duplication does not seems too bad. > > In the 'upstream' case, the auxiliary function would return > > branch->merge_name[0]. So the question is: can > > tracking_for_push_dest(branch->merge_name[0]) be different from > > branch->merge[0]->dst? > Those will both return tracking refs. I think you just want > merge[0]->src for the upstream case. > And yes, the two can be different. It's the same case as when the > upstream branch has a different name than the current branch. I meant, now that we have branch_get_push_remoteref, can we replace the body of branch_get_push_1 by remote = remote_get(pushremote_for_branch(branch, NULL)); ret = tracking_for_push_dest(remote, branch_get_push_remoteref(branch), err); (we would need to add error handling in branch_get_push_remoteref but that is easy) Currently that is exactly what branch_get_push_1 does, except in the PUSH_DEFAULT_UPSTREAM where it returns branch->merge[0]->dst. But branch->merge is set up in `set_merge`, where we have: ret->merge[i]->src = xstrdup(ret->merge_name[i]); ... if (dwim_ref(ret->merge_name[i], strlen(ret->merge_name[i]), &oid, &ref) == 1) ret->merge[i]->dst = ref; So my question was: can dwim_ref(branch->merge[0]->src) be different from tracking_for_push_dest(branch->merge[0]->src)? > Yeah, I think that's going to be the easiest. It would be nice to avoid > repeating that switch(), but frankly I think the boilerplate you'll end > up with trying to handle the two cases may be worse than just repeating > it. That's what I went with. We can always refactorise branch_get_push_1 to use branch_get_push_remoteref afterwards. > It may be worth adding a comment to each function to mention the > other, and that any changes need to match. I tried to add a comment, but I don't know if it is helpful enough.
diff --git a/remote.c b/remote.c index 593ce297ed..75e42b1e36 100644 --- a/remote.c +++ b/remote.c @@ -538,6 +538,11 @@ const char *remote_ref_for_branch(struct branch *branch, int for_push, *explicit = 1; return dst; } + else if (remote) { + if (explicit) + *explicit = 1; + return branch->refname; + } } } if (explicit)
To get the meaning of push:remoteref, ref-filter.c calls remote_ref_for_branch. However remote_ref_for_branch only handles the case of a specified refspec. The other cases treated by branch_get_push_1 are the mirror case, PUSH_DEFAULT_{NOTHING,MATCHING,CURRENT,UPSTREAM,UNSPECIFIED,SIMPLE}. In all these cases, either there is no push remote, or the remote_ref is branch->refname. So we can handle all these cases by returning branch->refname, provided that remote is not empty. Signed-off-by: Damien Robert <damien.olivier.robert+git@gmail.com> --- remote.c | 5 +++++ 1 file changed, 5 insertions(+)