mbox series

[v5,0/3] New options to support "simple" centralized workflow

Message ID pull.1161.v5.git.1651226206.gitgitgadget@gmail.com (mailing list archive)
Headers show
Series New options to support "simple" centralized workflow | expand

Message

Johannes Schindelin via GitGitGadget April 29, 2022, 9:56 a.m. UTC
This patchset introduces two new configuration options, intended to be
consistent with and complementary to the push.default "simple" option. It
also improves remote-defaulting in "default push" scenarios.

In some "simple" centralized workflows, users expect remote tracking branch
names to match local branch names. "git push" pushes to the remote
version/instance of the branch, and "git pull" pulls any changes to the
remote branch (changes made by the same user in another place, or by other
users). The default push.default option, "simple", supports this kind of
workflow by "raising eyebrows" if a user looks like they are trying to push
to the "wrong" remote tracking branch.

None of the existing branch.autosetupmerge settings support this
workflow/expectation well, so the new "branch.autosetupmerge=simple" option
addresses this - acting like the default "remote" option only when the
remote branch and (new) local branch have the same name. The new option is
referred to in new advice in the push.default=simple mismatching remote
branch error text.

At a later stage in the new-branch workflow, when a user first goes to push
a new branch to the remote, the default "git push" will complain that there
is no remote tracking branch (unless push.default=current). For users that
always expect remote branch names to match local branch names, on a single
remote, this is inconvenient. New config setting "push.autoSetupRemote"
addresses this by automatically specifying "--set-upstream" (and allowing
the push) when there is no configured remote for push.default options
"simple", "upstream", and "current". In the case of "current", this helps
make "pull" work correctly (under these workflow assumptions). For the other
two options the primary benefit is being able to simply say "git push" and
not be interrupted with an unnecessary "but this is a new branch!" error.

Along the way, we also enhance the remote-defaulting behavior for "git push"
(and ls-remote) to not only support "origin" as the default remote, but
rather any single configured remote. Default push should only fail for lack
of a remote if there are none, or if there is more than one and none are
called "origin".

Changes since v4:

 * Changed patchset subject to "New options to support "simple" centralized
   workflow", reflecting the fact that there are now two new config options
   available
 * Added some advice to the default push "mismatching remote tracking branch
   name" error, offering the new branch.autosetupmerge=simple option, so
   that new users can potentially discover and benefit from it
 * Introduced a new commit improving the defaulting of remote for "default
   push" (and ls-remote), and fixing and adding related tests
 * Introduced a new commit for new config setting push.autoSetupRemote,
   which will avoid the need for users to explicitly push to a specific
   origin, explicitly requesting tracking, when doing a default push for a
   new branch (with advice and tests).
 * Rebased onto current 'master'

Open questions:

 * The exact text of the two new pieces of advice should get some review, it
   is likely improvable
 * The name and config help of the "push.autoSetupRemote" config setting
   should also be reviewed - there is confusion (at least in my mind)
   between "upstream", "remote tracking", and "remote merge" concepts.

Tao Klerks (3):
  branch: new autosetupmerge option 'simple' for matching branches
  push: default to single remote even when not named origin
  push: new config option "push.autoSetupRemote" supports "simple" push

 Documentation/config/branch.txt |  9 ++--
 Documentation/config/push.txt   | 11 +++++
 Documentation/git-branch.txt    | 18 +++++---
 branch.c                        | 27 +++++++++++-
 branch.h                        |  1 +
 builtin/push.c                  | 64 +++++++++++++++++++++------
 config.c                        |  3 ++
 remote.c                        |  2 +
 t/t3200-branch.sh               | 35 +++++++++++++++
 t/t5512-ls-remote.sh            | 17 ++++++--
 t/t5528-push-default.sh         | 77 ++++++++++++++++++++++++++++++++-
 transport.h                     |  1 +
 12 files changed, 237 insertions(+), 28 deletions(-)


base-commit: 6cd33dceed60949e2dbc32e3f0f5e67c4c882e1e
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1161%2FTaoK%2Ffeature-branch-autosetupmerge-simple-v5
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1161/TaoK/feature-branch-autosetupmerge-simple-v5
Pull-Request: https://github.com/gitgitgadget/git/pull/1161

Range-diff vs v4:

 1:  eca8ab2eb7b ! 1:  5b08edcdeef merge: new autosetupmerge option 'simple' for matching branches
     @@ Metadata
      Author: Tao Klerks <tao@klerks.biz>
      
       ## Commit message ##
     -    merge: new autosetupmerge option 'simple' for matching branches
     +    branch: new autosetupmerge option 'simple' for matching branches
      
          With the default push.default option, "simple", beginners are
          protected from accidentally pushing to the "wrong" branch in
     @@ Commit message
          a "default push", they get an error and explanation with options.
      
          There is a particular centralized workflow where this often happens:
     -    a user branches to a new local feature branch from an existing
     -    upstream branch, eg with "checkout -b feature1 origin/master". With
     +    a user branches to a new local topic branch from an existing
     +    remote branch, eg with "checkout -b feature1 origin/master". With
          the default branch.autosetupmerge configuration (value "true"), git
     -    will automatically add origin/master as the remote tracking branch.
     +    will automatically add origin/master as the upstream tracking branch.
      
     -    When the user pushes with "git push", they get an error, and (amongst
     -    other things) a suggestion to run "git push origin HEAD". Eventually
     -    they figure out to add "-u" to change the tracking branch, or they set
     -    push.default to "current", or some tooling does one or the other of
     -    these things for them.
     +    When the user pushes with a default "git push", with the intention of
     +    pushing their (new) topic branch to the remote, they get an error, and
     +    (amongst other things) a suggestion to run "git push origin HEAD".
      
     -    When one of their coworkers works on the same branch, they don't get
     -    any of that weirdness. They just "git checkout feature1" and
     -    everything works exactly as they expect, with the shared remote branch
     -    set up as remote tracking branch, and push and pull working out of the
     -    box.
     +    If they follow this suggestion the push succeeds, but on subsequent
     +    default pushes they continue to get an error - so eventually they
     +    figure out to add "-u" to change the tracking branch, or they spelunk
     +    the push.default config doc as proposed and set it to "current", or
     +    some GUI tooling does one or the other of these things for them.
     +
     +    When one of their coworkers later works on the same topic branch,
     +    they don't get any of that "weirdness". They just "git checkout
     +    feature1" and everything works exactly as they expect, with the shared
     +    remote branch set up as remote tracking branch, and push and pull
     +    working out of the box.
      
          The "stable state" for this way of working is that local branches have
          the same-name remote tracking branch (origin/feature1 in this
     @@ Commit message
          the remote "master" branch instead (a completely different class of
          changes!)
      
     -    Any experienced git user will presumably say "well yeah, that's what
     -    it means to have the remote tracking branch set to origin/master!" -
     -    but that user didn't *ask* to have the remote master branch added as
     -    remote tracking branch - that just happened automatically when they
     -    branched their feature branch. They didn't necessarily even notice or
     -    understand the meaning of the "set up to track 'origin/master'"
     +    An experienced git user might say "well yeah, that's what it means to
     +    have the remote tracking branch set to origin/master!" - but the
     +    original user above didn't *ask* to have the remote master branch
     +    added as remote tracking branch - that just happened automatically
     +    when they branched their feature branch. They didn't necessarily even
     +    notice or understand the meaning of the "set up to track 'origin/master'"
          message when they created the branch - especially if they are using a
          GUI.
      
          Looking at how to fix this, you might think "OK, so disable auto setup
          of remote tracking - set branch.autosetupmerge to false" - but that
          will inconvenience the *second* user in this story - the one who just
     -    wanted to start working on the feature branch. The first and second
     +    wanted to start working on the topic branch. The first and second
          users swap roles at different points in time of course - they should
          both have a sane configuration that does the right thing in both
          situations.
      
     -    Make these flows painless by introducing a new branch.autosetupmerge
     -    option called "simple", to match the same-name "push.default" option
     -    that makes similar assumptions.
     +    Make this "branches have the same name locally as on the remote"
     +    workflow less painful / more obvious by introducing a new
     +    branch.autosetupmerge option called "simple", to match the same-name
     +    "push.default" option that makes similar assumptions.
      
          This new option automatically sets up tracking in a *subset* of the
          current default situations: when the original ref is a remote tracking
          branch *and* has the same branch name on the remote (as the new local
          branch name).
      
     +    Update the error displayed when the 'push.default=simple' configuration
     +    rejects a mismatching-upstream-name default push, to offer this new
     +    branch.autosetupmerge option that will prevent this class of error.
     +
          With this new configuration, in the example situation above, the first
          user does *not* get origin/master set up as the tracking branch for
          the new local branch. If they "git pull" in their new local-only
     @@ Documentation/git-branch.txt: The exact upstream branch is chosen depending on t
      
       ## branch.c ##
      @@ branch.c: static int find_tracked_branch(struct remote *remote, void *priv)
     - 			free(tracking->spec.src);
       			string_list_clear(tracking->srcs, 0);
     + 		break;
       		}
      +		/* remote_find_tracking() searches by src if present */
       		tracking->spec.src = NULL;
     @@ branch.c: static void setup_tracking(const char *new_ref, const char *orig_ref,
      +	 * that supports multiple entries in tracking_srcs but
      +	 * leaves tracking.matches at 0.
      +	 */
     - 	if (tracking.matches > 1)
     - 		die(_("not tracking: ambiguous information for ref %s"),
     - 		    orig_ref);
     + 	if (tracking.matches > 1) {
     + 		int status = die_message(_("not tracking: ambiguous information for ref '%s'"),
     + 					    orig_ref);
     +@@ branch.c: static void setup_tracking(const char *new_ref, const char *orig_ref,
     + 		exit(status);
     + 	}
       
      +	if (track == BRANCH_TRACK_SIMPLE) {
      +		/*
     @@ branch.c: static void setup_tracking(const char *new_ref, const char *orig_ref,
       	if (tracking.srcs->nr < 1)
       		string_list_append(tracking.srcs, orig_ref);
       	if (install_branch_config_multiple_remotes(config_flags, new_ref,
     +@@ branch.c: static int submodule_create_branch(struct repository *r,
     + 		/* Default for "git checkout". Do not pass --track. */
     + 	case BRANCH_TRACK_REMOTE:
     + 		/* Default for "git branch". Do not pass --track. */
     ++	case BRANCH_TRACK_SIMPLE:
     ++		/* Config-driven only. Do not pass --track. */
     + 		break;
     + 	}
     + 
      
       ## branch.h ##
      @@ branch.h: enum branch_track {
     @@ branch.h: enum branch_track {
       
       extern enum branch_track git_branch_track;
      
     + ## builtin/push.c ##
     +@@
     +  * "git push"
     +  */
     + #include "cache.h"
     ++#include "branch.h"
     + #include "config.h"
     + #include "refs.h"
     + #include "refspec.h"
     +@@ builtin/push.c: static NORETURN void die_push_simple(struct branch *branch,
     + 	 * upstream to a non-branch, we should probably be showing
     + 	 * them the big ugly fully qualified ref.
     + 	 */
     +-	const char *advice_maybe = "";
     ++	const char *advice_pushdefault_maybe = "";
     ++	const char *advice_automergesimple_maybe = "";
     + 	const char *short_upstream = branch->merge[0]->src;
     + 
     + 	skip_prefix(short_upstream, "refs/heads/", &short_upstream);
     +@@ builtin/push.c: static NORETURN void die_push_simple(struct branch *branch,
     + 	 * push.default.
     + 	 */
     + 	if (push_default == PUSH_DEFAULT_UNSPECIFIED)
     +-		advice_maybe = _("\n"
     ++		advice_pushdefault_maybe = _("\n"
     + 				 "To choose either option permanently, "
     +-				 "see push.default in 'git help config'.");
     ++				 "see push.default in 'git help config'.\n");
     ++	if (git_branch_track != BRANCH_TRACK_SIMPLE)
     ++		advice_automergesimple_maybe = _("\n"
     ++				 "To avoid automatically configuring "
     ++				 "upstream branches when their name\n"
     ++				 "doesn't match the local branch, see option "
     ++				 "'simple' of branch.autosetupmerge\n"
     ++				 "in 'git help config'.\n");
     + 	die(_("The upstream branch of your current branch does not match\n"
     + 	      "the name of your current branch.  To push to the upstream branch\n"
     + 	      "on the remote, use\n"
     +@@ builtin/push.c: static NORETURN void die_push_simple(struct branch *branch,
     + 	      "To push to the branch of the same name on the remote, use\n"
     + 	      "\n"
     + 	      "    git push %s HEAD\n"
     +-	      "%s"),
     ++	      "%s%s"),
     + 	    remote->name, short_upstream,
     +-	    remote->name, advice_maybe);
     ++	    remote->name, advice_pushdefault_maybe,
     ++	    advice_automergesimple_maybe);
     + }
     + 
     + static const char message_detached_head_die[] =
     +
       ## config.c ##
      @@ config.c: static int git_default_branch_config(const char *var, const char *value)
       		} else if (value && !strcmp(value, "inherit")) {
 -:  ----------- > 2:  31184c3a65d push: default to single remote even when not named origin
 -:  ----------- > 3:  41c88e51ac6 push: new config option "push.autoSetupRemote" supports "simple" push

Comments

Junio C Hamano April 29, 2022, 6:50 p.m. UTC | #1
"Tao Klerks via GitGitGadget" <gitgitgadget@gmail.com> writes:

> This patchset introduces two new configuration options, intended to be
> consistent with and complementary to the push.default "simple" option. It
> also improves remote-defaulting in "default push" scenarios.

Thanks.  I still do not know offhand if the 'simple' thing makes
sense without thinking it through, but I think that the 'missing
origin is fine and we can use the unique remote if exists' is a
really good idea, especially if some push strategies already do so
and some don't, which seems to be the case.

Will queue.
Tao Klerks April 30, 2022, 3:48 p.m. UTC | #2
On Fri, Apr 29, 2022 at 8:50 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> I still do not know offhand if the 'simple' thing makes
> sense without thinking it through,

At the risk of insisting too much, I'd like to break this down into 3 parts:

1) To what extent does a "there is one remote, and local and remote
branches have the same name unless I explicitly choose to do something
different" perspective make sense to a given population of users, and
how large is that population of users?

2) For those users, can and should we have a better UX?

3) To what extent does it make sense to call this mode of working
"simple", what are the best UX changes to make, what should any new
options be called, and how should discoverability be implemented?


1. Audience:

I understand that git was designed as a distributed VCS, and that's a
completely fundamental aspect of its power and success... but the
reality (or "my claim"?) is that the vast majority of users end up
using git with a single remote per repo. I don't know how to
categorically confirm this - I suspect the github/microsoft, google
and other sponsor-type folks here will have more access to research on
the topic. I don't want to imply that git should do less, but just
that the idea of "multiple remotes" is alien to almost every git user
I've ever interacted with. Obviously as I work in a corporate
environment I have a particular perspective... but I find this to be
true of github users also.

Within the context of such "single remote per repo" users, I've spoken
with a dozen users of varying git experience levels to try to
understand whether *any* of them intentionally end up with an
"upstream tracking branch different from the local branch name"
scenario, and what they use it for. I found two users who had ever
done this intentionally: One who had done it once, when faced with a
project with crazy machine-generated branch names, and another who
does it routinely to have nice short local branch names (very much an
advanced user and enthusiast). To the majority it's only ever happened
by accident, and they didn't even understand what was going on. It was
just a weird message they got and eventually worked around.

Amusingly, one was an old hand, and still avoided the default "git
push" because he remembered a time when that pushed all branches, and
did not realize the default behavior had changed to "current branch,
as long as remote tracking name matches" (aka push.default=simple) 8
years ago.

All these users are aware that there are options that change git's
behavior, but the only one who ever took the time to understand and
consider changing the defaults was the expert user enthusiast.

I realize all this is anecdotal, I'm a hobbyist and a novice in this
community, and the deployment I support only has a few hundred users
at the moment - but surely there must be a way to confirm whether it's
true that git's primary value to millions of users in the world is in
a context where there is a single remote, and branches normally and
intentionally have exactly the same name locally and on that single
remote?

2. Current Experience

Taking the user model/workflow above and its statistical significance
as a given, what's the "problem"?

A) a user can accidentally end up in an unexpected state, and not
easily understand why or what's going on, if they do "git checkout -b
mybranch origin/whatever" - that is, if they choose to branch from a
known remote state, rather than creating a local branch for that
remote branch first. In this unexpected state their "git pull" is not
doing what they expect (it's bringing in changes from a *different*
branch), and their "git push" is not working.

Furthermore, the error message for "git push" is not actually giving
them the right option to solve their problem - it suggests they push
to the same-name remote branch, but does not propose the "-u" option,
because git can't be sure the mismatching branch name isn't
intentional and "-u" would be a kind of destructive change! So they
will remain in this weird/unexpected state unless/until they figure
out for themselves to specify -u or otherwise change the tracking
upstream. I've seen people delete the local branch, and recreate it,
just to sort out the remote tracking, because it's just not obvious to
them what is going on!

Other flows don't have this issue, eg if they first "git checkout
master" (potentially creating a new master branch with tracking from
remote) and then "git checkout -b mybranch".

That inconsistency is part of the problem - it forces affected users
to think about remote tracking branches in a way they shouldn't need
to, in a way that is basically alien to their day-to-day experience
and expectations of the relationship between local and remote.

B) When a user creates a new branch and they want to push it, they get
an error that spits out a magical incantation hint, they repeat the
magical incantation, and then things are working as expected. This is
a lot better than lacking the hint, of course, but is a completely
unnecessary interruption in their workflow, *given the assumption that
remote branches for these users always have the same name as local
branches anyway*. The intention of a default "git push", in this (in
my opinion vast-majority) situation, is simply to make this branch
work with its remote equivalent.

3) Naming & changes to git behaviors

One way to approach the desired flow above would be to do away with or
ignore the concept of upstream tracking branches altogether, and have
a git behavior mode in which "git pull", "git push", and "git status"
all work automatically and consistently with the same-name remote
branch.

I think there are a few problems with that approach:
 - It would not be an on-ramp to slightly different behaviors / modes
of functioning
 - We'd have to figure out what to do with any then-ignored upstream
tracking entries for existing branches
 - It would involve a lot of code changes
 - It would be hard to explain in relation to all the rest of the doc/behaviors
 - A user interested in working with just a single
locally-differently-named branch (eg because they're working on a
server with remote branch names that they can't change and are
inconveniently long, or have complex prexif/namespacing requirements)
would not be able to make use of such a mode - they'd have to switch
to the "full/normal" mode.

Therefore, it makes more sense to figure out the smallest changes in
behavior that lead to meeting the expectations/conveniences above, and
don't prevent still keeping branches that have a different name to the
remote, when that is very explicitly desired & specified.

Hence the proposals in this patch series. I do truly believe that the
two small changes (new "don't auto-track differently-named upstream
branches" option, and new "automatically add remote tracking for
same-name branch if missing" option) are the right thing. What I don't
know, is whether they are *named* in the best possible way, and
whether the text of the proposed "hints" is the best way to help the
(in my opinion) majority of users who will probably benefit from
setting things up this way.


> but I think that the 'missing
> origin is fine and we can use the unique remote if exists' is a
> really good idea,

Cool, that's an easy one, and a separate commit if you want to split
it off. It's a prerequisite for the "push.autoSetupRemote" to work
well (in repos that have a single remote not called "origin"), but it
does not depend on the other proposed changes.

> especially if some push strategies already do so
> and some don't, which seems to be the case.

Not exactly - there are other *commands* that do this kind of "the
single remote" defaulting, but not other push strategies. The reason I
called out only two default push strategies explicitly, is that they
are the ones that can work without a remote tracking branch being
configured at all (as long as there is a remote called origin); the
other strategies depend on a remote being explicitly configured as
push default, or as branch remote, or as branch push remote.

>
> Will queue.

Great thx.