Message ID | 20190421040823.24821-1-alexhenrie24@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] Give git-pull a --reset option | expand |
Alex Henrie <alexhenrie24@gmail.com> writes: > A common workflow is to make a commit on a local branch, push the branch > to the remote, check out the remote branch on a second computer, amend > the commit on the second computer, force-push back to the remote branch, > and finally submit a pull request. However, if the user switches back to > the first computer, they must then run the cumbersome command > `git fetch && git reset --hard origin`. Doesn't anybody sense there is a larger problem if such a workflow is "common" in the first place? In that sequence, when you come back to the first repository there is no guarantee that what you are losing is exactly what you are willing to lose and nothing else (i.e. your earlier WIP you sent to the second repository, which was further polished, making the earlier WIP you had here irrelevant). If the last "recovery at the first repository" step were "pull --rebase", at least you would realize that you have the earlier WIP locally that either (1) conflicts with the more polished work that have been accepted at the upstream, in which case you can tell the rebase machinery to drop that earlier WIP _after_ making sure that it is only that stale WIP you not only are willing to but actively do want to lose that is getting discarded. (2) replays cleanly on top of the updated upstream, which hasn't accepted your pull request made from the second repository with the more polished version, in which case you'd realize that you may have to work on the topic further. And you have a chance to inspect what you ended up with before using "reset --hard" or "rebase -i" to discard what you no longer need. At least, I think the longhand this attempts to replace, "fetch" followed by "reset --hard origin" is better because of two reasons. It is more explicit that the procedure is destructive, and more importantly, it can allow to have (and we should encourage users to make a habit to have) an extra step to inspect what the user is about to lose with "git log origin.." after "fetch" but before "reset --hard". So I have a moderately strong suspicion that "git pull --reset" promotes a wrong workflow and should not exist.
On Sat, Apr 20, 2019 at 11:38 PM Junio C Hamano <gitster@pobox.com> wrote: > > Alex Henrie <alexhenrie24@gmail.com> writes: > > > A common workflow is to make a commit on a local branch, push the branch > > to the remote, check out the remote branch on a second computer, amend > > the commit on the second computer, force-push back to the remote branch, > > and finally submit a pull request. However, if the user switches back to > > the first computer, they must then run the cumbersome command > > `git fetch && git reset --hard origin`. > > Doesn't anybody sense there is a larger problem if such a workflow > is "common" in the first place? In that sequence, when you come > back to the first repository there is no guarantee that what you are > losing is exactly what you are willing to lose and nothing else > (i.e. your earlier WIP you sent to the second repository, which was > further polished, making the earlier WIP you had here irrelevant). You may be right. On the other hand, you're expected to think about what you're doing before running `git push --force` and clobbering a remote branch. Similarly, you would be expected to think about what you're doing before running `git pull --reset` and clobbering a local branch. It's actually easier to recover from accidentally clobbering a local branch than accidentally clobbering a remote branch because you can use `git reflog` to find the lost commits. > If the last "recovery at the first repository" step were "pull --rebase", > at least you would realize that you have the earlier WIP locally > that either > > (1) conflicts with the more polished work that have been > accepted at the upstream, in which case you can tell the > rebase machinery to drop that earlier WIP _after_ making > sure that it is only that stale WIP you not only are willing > to but actively do want to lose that is getting discarded. > > (2) replays cleanly on top of the updated upstream, which hasn't > accepted your pull request made from the second repository > with the more polished version, in which case you'd realize > that you may have to work on the topic further. And you > have a chance to inspect what you ended up with before using > "reset --hard" or "rebase -i" to discard what you no longer > need. I understand that `git pull --rebase` followed by `git rebase --skip` is a safer workflow. I just feel like an option like `git pull --reset` should be there for users who know what they're doing, just like `git push --force` is available for users who know what they're doing. > At least, I think the longhand this attempts to replace, "fetch" > followed by "reset --hard origin" is better because of two reasons. > It is more explicit that the procedure is destructive, and more > importantly, it can allow to have (and we should encourage users to > make a habit to have) an extra step to inspect what the user is > about to lose with "git log origin.." after "fetch" but before > "reset --hard". I'd be happy to emphasize the destructive nature of this option by calling it `git pull --hard-reset` instead. > So I have a moderately strong suspicion that "git pull --reset" > promotes a wrong workflow and should not exist. It'd be great to get some feedback from other Git users, but in the end it's up to you and I trust your decision. -Alex
Hi Alex On 21/04/2019 08:01, Alex Henrie wrote: > On Sat, Apr 20, 2019 at 11:38 PM Junio C Hamano <gitster@pobox.com> wrote: >> Alex Henrie <alexhenrie24@gmail.com> writes: >> >>> A common workflow is to make a commit on a local branch, push the branch >>> to the remote, check out the remote branch on a second computer, amend >>> the commit on the second computer, force-push back to the remote branch, >>> and finally submit a pull request. However, if the user switches back to >>> the first computer, they must then run the cumbersome command >>> `git fetch && git reset --hard origin`. This will be quite a common occurrence especially for personal repos on one of the hosting sites (GitHub, GitLab, etc), so we know that we are the only user of that repo. I've certainly used that style of technique for Windows vs Linux machine transfers of Git work (when my old hack linux machine is functional:-). >> Doesn't anybody sense there is a larger problem if such a workflow >> is "common" in the first place? In that sequence, when you come >> back to the first repository there is no guarantee that what you are >> losing is exactly what you are willing to lose and nothing else >> (i.e. your earlier WIP you sent to the second repository, which was >> further polished, making the earlier WIP you had here irrelevant). I'd agree that a public/joint repository could have issues if a user blindly assumes that no one has cooperated with them and added more commits, but that becomes a social issue (plus worthy of a documentation mention!). There are still a few blind spots in the functionality of Git regarding how users interact with their hosting provider, which is trying to be (from a user perspective) both a backup and an independent repo (side discussion at Git Merge). This would appear to be one of those cases where one is collaborating with no-one (oneself), but from two machines. It may just be that we need to put aside `pull` because of its old semantics, and start with a fresh command name for the 'fetch + sort stuff out' for these use cases. > You may be right. On the other hand, you're expected to think about > what you're doing before running `git push --force` and clobbering a > remote branch. Similarly, you would be expected to think about what > you're doing before running `git pull --reset` and clobbering a local > branch. It's actually easier to recover from accidentally clobbering a > local branch than accidentally clobbering a remote branch because you > can use `git reflog` to find the lost commits. > >> If the last "recovery at the first repository" step were "pull --rebase", >> at least you would realize that you have the earlier WIP locally >> that either >> >> (1) conflicts with the more polished work that have been >> accepted at the upstream, in which case you can tell the >> rebase machinery to drop that earlier WIP _after_ making >> sure that it is only that stale WIP you not only are willing >> to but actively do want to lose that is getting discarded. >> >> (2) replays cleanly on top of the updated upstream, which hasn't >> accepted your pull request made from the second repository >> with the more polished version, in which case you'd realize >> that you may have to work on the topic further. And you >> have a chance to inspect what you ended up with before using >> "reset --hard" or "rebase -i" to discard what you no longer >> need. > I understand that `git pull --rebase` followed by `git rebase --skip` > is a safer workflow. I just feel like an option like `git pull > --reset` should be there for users who know what they're doing, just > like `git push --force` is available for users who know what they're > doing. > >> At least, I think the longhand this attempts to replace, "fetch" >> followed by "reset --hard origin" is better because of two reasons. >> It is more explicit that the procedure is destructive, and more >> importantly, it can allow to have (and we should encourage users to >> make a habit to have) an extra step to inspect what the user is >> about to lose with "git log origin.." after "fetch" but before >> "reset --hard". > I'd be happy to emphasize the destructive nature of this option by > calling it `git pull --hard-reset` instead. > >> So I have a moderately strong suspicion that "git pull --reset" >> promotes a wrong workflow and should not exist. > It'd be great to get some feedback from other Git users, but in the > end it's up to you and I trust your decision. > > -Alex I though it worth chiming in that folks do use these simple dumb use cases. -- Philip
On Sun, Apr 21, 2019 at 02:38:38PM +0900, Junio C Hamano wrote: > Alex Henrie <alexhenrie24@gmail.com> writes: > > > A common workflow is to make a commit on a local branch, push the branch > > to the remote, check out the remote branch on a second computer, amend > > the commit on the second computer, force-push back to the remote branch, > > and finally submit a pull request. However, if the user switches back to > > the first computer, they must then run the cumbersome command > > `git fetch && git reset --hard origin`. > > Doesn't anybody sense there is a larger problem if such a workflow > is "common" in the first place? In that sequence, when you come > back to the first repository there is no guarantee that what you are > losing is exactly what you are willing to lose and nothing else > (i.e. your earlier WIP you sent to the second repository, which was > further polished, making the earlier WIP you had here irrelevant). It may be helpful to point out that this is essentially the workflow I had before I had only one computer. I would make some changes on my desktop, which was my primary computer, then need to travel somewhere and use my laptop. I would want to go back to my desktop when I returned home, because it was far more powerful, and I would know that I was the only user of this branch. Now, as a contributor and a moderately advanced user, I would likely end up using "git commit --fixup" (or "--squash") for this purpose and squash only before submitting, but there are situations where fixup commits cause conflicts and it's necessary to do a rebase and force push if you don't want extensive pain. So while I think that "git pull --rebase" or "git pull --ff-only" are probably better options in most situations, I can see the use in this command, with appropriate foresight and knowledge. I can also see how it's easy to blow away data with the proposed option, especially for novice users, who are likely not aware of how to restore it from the reflog. I'm not sure if this email is an argument for or against this option, but maybe it provides some helpful perspective.
"brian m. carlson" <sandals@crustytoothpaste.net> writes: > It may be helpful to point out that this is essentially the workflow I > had ... > I'm not sure if this email is an argument for or against this option, > but maybe it provides some helpful perspective. I think you and Phillip misread me. I did not question if the workflow is common the message you are responding to. I've done my fair share of "I know what I have on my laptop is stale as I pushed it out to elsewhere to continue working on it, so let's get rid of it from the updated upstream and start afresh" myself. What I questioned was if it is sensible to ensure that it stays common. We'd be encouraging the dangerous workflow, instead of analysing the situation where people employ it and trying to give them a better alternative to deal with the situation. That is what we'd be doing with "pull --reset", i.e. an easier to type short-hand that robs the chance to double check in the middle of "fetch && inspect && reset" sequence. As to where the feature should go, if we decide that it is a good idea to promote this workflow element as a supported feature, I agree with Alex's design in the patch that it makes sense to have it as "pull --reset". Existing "pull --merge" and "pull -rebase" are "fetch and then integrate by merging theirs into ours" and "fetch and then integrate by rebasing our stuff on top of theirs"; the new "pull --reset" would fit well sitting next to them with its "fetch and then treat our stuff as valueless and replace it with theirs" semantics.
On 22/04/2019 00:38, Junio C Hamano wrote: > "brian m. carlson" <sandals@crustytoothpaste.net> writes: > >> It may be helpful to point out that this is essentially the workflow I >> had ... >> I'm not sure if this email is an argument for or against this option, >> but maybe it provides some helpful perspective. > > I think you and Phillip misread me. > > I did not question if the workflow is common the message you are > responding to. I've done my fair share of "I know what I have on my > laptop is stale as I pushed it out to elsewhere to continue working > on it, so let's get rid of it from the updated upstream and start > afresh" myself. > > What I questioned was if it is sensible to ensure that it stays > common. > > We'd be encouraging the dangerous workflow, instead of analysing the > situation where people employ it and trying to give them a better > alternative to deal with the situation. I agree that it would be better to support this workflow in a safer manner. I use a script that looks at the reflog of the local and remote branches to work out if it is safe to reset --hard when pulling. If there are local changes since the last push/pull (because I forget to sync before starting work) then it will run rebase --onto <new-remote> <last-sync>. If the remote branch and local branches have both been rewritten it tells me to run range-diff to figure out how to fix it. The value of <last-sync> is the most recent push or pull that updated the local branch. i.e. when both the local and remote refs pointed to the same commit at the same time (rebase complicates this as the remote we rebased onto does not appear in directly the local reflog - you have to look for "rebase .*finish.* <local-ref> <remote-oid>". Also an ide might auto fetch and update the remote ref while we're rebasing the local work which stops the reflog coinciding exactly.) The <last-sync> is also useful for making forced pushes safer. The script pushes with --force-with-lease=<ref>:<last-sync> if the local branch has been rewritten since the last sync so that any remote changes since the last sync are not overwritten. Using the reflog to figure out the last sync makes forced pushes and pulls much safer. If we want to support this workflow in core git (and it does seem to be fairly common) then I think it needs to be based on something that prevents people from accidentally throwing away local changes when pulling and remote changes when pushing. There was some discussion of this with respect to pushing last summer[1] Best Wishes Phillip [1] https://public-inbox.org/git/CAEFop40OJ5MRwM8zxE44yB0f2Fxw9YsUdM1e-H=Nn9e=sAGJ=w@mail.gmail.com/T/#u That is what we'd be doing > with "pull --reset", i.e. an easier to type short-hand that robs the > chance to double check in the middle of "fetch && inspect && reset" > sequence. > > As to where the feature should go, if we decide that it is a good > idea to promote this workflow element as a supported feature, I > agree with Alex's design in the patch that it makes sense to have it > as "pull --reset". Existing "pull --merge" and "pull -rebase" are > "fetch and then integrate by merging theirs into ours" and "fetch > and then integrate by rebasing our stuff on top of theirs"; the new > "pull --reset" would fit well sitting next to them with its "fetch > and then treat our stuff as valueless and replace it with theirs" > semantics. >
diff --git a/Documentation/git-pull.txt b/Documentation/git-pull.txt index 118d9d86f7..032a5c2e34 100644 --- a/Documentation/git-pull.txt +++ b/Documentation/git-pull.txt @@ -23,6 +23,7 @@ More precisely, 'git pull' runs 'git fetch' with the given parameters and calls 'git merge' to merge the retrieved branch heads into the current branch. With `--rebase`, it runs 'git rebase' instead of 'git merge'. +With `--reset`, it runs `git reset --hard` instead of 'git merge'. <repository> should be the name of a remote repository as passed to linkgit:git-fetch[1]. <refspec> can name an @@ -141,6 +142,13 @@ unless you have read linkgit:git-rebase[1] carefully. + This option is only valid when "--rebase" is used. +--reset:: + Reset the local branch to be identical to the remote branch, discarding + any local commits or other changes. + +--no-reset:: + Override earlier --reset. + Options related to fetching ~~~~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/builtin/pull.c b/builtin/pull.c index 33db889955..b32134c1f1 100644 --- a/builtin/pull.c +++ b/builtin/pull.c @@ -95,6 +95,7 @@ static int recurse_submodules = RECURSE_SUBMODULES_DEFAULT; /* Options passed to git-merge or git-rebase */ static enum rebase_type opt_rebase = -1; +static char *opt_reset; static char *opt_diffstat; static char *opt_log; static char *opt_signoff; @@ -144,6 +145,9 @@ static struct option pull_options[] = { "(false|true|merges|preserve|interactive)", N_("incorporate changes by rebasing rather than merging"), PARSE_OPT_OPTARG, parse_opt_rebase }, + OPT_PASSTHRU(0, "reset", &opt_reset, NULL, + N_("discard all local changes rather than merging"), + PARSE_OPT_NOARG | PARSE_OPT_NONEG), OPT_PASSTHRU('n', NULL, &opt_diffstat, NULL, N_("do not show a diffstat at the end of the merge"), PARSE_OPT_NOARG | PARSE_OPT_NONEG), @@ -860,6 +864,16 @@ static int run_rebase(const struct object_id *curr_head, return ret; } +/** + * Runs git-reset, returning its exit status. + */ +static int run_reset(void) +{ + static const char *argv[] = { "reset", "--hard", "FETCH_HEAD", NULL }; + + return run_command_v_opt(argv, RUN_GIT_CMD); +} + int cmd_pull(int argc, const char **argv, const char *prefix) { const char *repo, **refspecs; @@ -892,6 +906,9 @@ int cmd_pull(int argc, const char **argv, const char *prefix) if (get_oid("HEAD", &orig_head)) oidclr(&orig_head); + if (opt_rebase && opt_reset) + die(_("--rebase and --reset are mutually exclusive")); + if (!opt_rebase && opt_autostash != -1) die(_("--[no-]autostash option is only valid with --rebase.")); @@ -986,6 +1003,12 @@ int cmd_pull(int argc, const char **argv, const char *prefix) ret = rebase_submodules(); return ret; + } else if (opt_reset) { + int ret = run_reset(); + if (!ret && (recurse_submodules == RECURSE_SUBMODULES_ON || + recurse_submodules == RECURSE_SUBMODULES_ON_DEMAND)) + ret = update_submodules(); + return ret; } else { int ret = run_merge(); if (!ret && (recurse_submodules == RECURSE_SUBMODULES_ON || diff --git a/t/t5520-pull.sh b/t/t5520-pull.sh index cf4cc32fd0..597f2429d5 100755 --- a/t/t5520-pull.sh +++ b/t/t5520-pull.sh @@ -708,4 +708,28 @@ test_expect_success 'git pull --rebase against local branch' ' test file = "$(cat file2)" ' +test_expect_success 'git pull --rebase --reset is an invalid combination' ' + test_must_fail git pull --rebase --reset . +' + +test_expect_success 'git pull --reset overwrites or deletes all tracked files' ' + git init cloned && + ( + cd cloned && + echo committed > committed && + echo staged > staged && + echo untracked > file && + echo untracked > untracked && + git add committed && + git commit -m original && + git add staged && + git pull --reset .. && + test ! -f committed && + test ! -f staged && + test "$(cat untracked)" = "untracked" && + test "$(git status --porcelain)" = "?? untracked" + ) && + test_cmp file cloned/file +' + test_done
A common workflow is to make a commit on a local branch, push the branch to the remote, check out the remote branch on a second computer, amend the commit on the second computer, force-push back to the remote branch, and finally submit a pull request. However, if the user switches back to the first computer, they must then run the cumbersome command `git fetch && git reset --hard origin`. (Actually, at this point Git novices often try running `git pull --force`, but it doesn't do what they expect.) This patch adds the shortcut `git pull --reset` to serve as a complement to `git push --force`. Signed-off-by: Alex Henrie <alexhenrie24@gmail.com> --- Documentation/git-pull.txt | 8 ++++++++ builtin/pull.c | 23 +++++++++++++++++++++++ t/t5520-pull.sh | 24 ++++++++++++++++++++++++ 3 files changed, 55 insertions(+)