diff mbox series

rebase: introduce --update-branches option

Message ID 20190902234109.2922-2-wh109@yahoo.com (mailing list archive)
State New, archived
Headers show
Series rebase: introduce --update-branches option | expand

Commit Message

Warren He Sept. 2, 2019, 11:41 p.m. UTC
Rebasing normally updates the current branch to the rewritten version.
If any other branches point to commits rewritten along the way, those
remain untouched. This commit adds an `--update-branches` option, which
instructs the command to update any such branches that it encounters to
point to the rewritten versions of those commits.

Signed-off-by: Warren He <wh109@yahoo.com>
---
 Documentation/git-rebase.txt      |  8 +++++
 builtin/rebase.c                  | 13 ++++++--
 sequencer.c                       | 68 ++++++++++++++++++++++++++++++++++++++-
 sequencer.h                       |  6 ++--
 t/t3431-rebase-update-branches.sh | 64 ++++++++++++++++++++++++++++++++++++
 5 files changed, 154 insertions(+), 5 deletions(-)
 create mode 100755 t/t3431-rebase-update-branches.sh

Comments

Johannes Schindelin Sept. 3, 2019, 1:26 p.m. UTC | #1
Hi Warren,

On Mon, 2 Sep 2019, Warren He wrote:

> Rebasing normally updates the current branch to the rewritten version.
> If any other branches point to commits rewritten along the way, those
> remain untouched. This commit adds an `--update-branches` option, which
> instructs the command to update any such branches that it encounters to
> point to the rewritten versions of those commits.

This is definitely a feature I longed for, myself. I worked around it by
using some complicated vim command (I think it was something like
`%s/^label (.*)/&^Mexec git update-ref refs/heads/\1 HEAD` or some such,
where the `^M` is actually typed via Ctrl+V Ctrl+M, and of course, this
does not take existing branches into account, but updates refs based on
the merge commits' onelines).

Once `--update-branches` is available, I will want to use it e.g. for my
work on the built-in `git add -i`, where I have to update 6
inter-dependent PRs in parallel.

A couple of suggestions below. (If you are like me, trying to address
all the suggestions while reading the code review, I would like to
suggest reading all the way through to the end first, it might save you
some work.)

> [...]
> diff --git a/builtin/rebase.c b/builtin/rebase.c
> index 670096c..cf87c53 100644
> --- a/builtin/rebase.c
> +++ b/builtin/rebase.c
> @@ -447,6 +449,8 @@ int cmd_rebase__interactive(int argc, const char **argv, const char *prefix)
>  		OPT_NEGBIT(0, "ff", &opts.flags, N_("allow fast-forward"),
>  			   REBASE_FORCE),
>  		OPT_BOOL(0, "keep-empty", &opts.keep_empty, N_("keep empty commits")),
> +		OPT_BOOL(0, "update-branches", &opts.update_branches,
> +			 N_("update branches that point to reapplied commits")),

Maybe `s/reapplied/rebased/`?

Also, seeing as the `cmd_rebase__interactive()` is only used by `git
rebase --preserve-merges`, which cannot make use of `--update-branches`
anyway, I would suggest to simply drop this hunk and only keep the next
one.

>  		OPT_BOOL(0, "allow-empty-message", &opts.allow_empty_message,
>  			 N_("allow commits with empty messages")),
>  		OPT_BOOL(0, "rebase-merges", &opts.rebase_merges, N_("rebase merge commits")),
> @@ -1453,6 +1457,8 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>  		OPT_RERERE_AUTOUPDATE(&options.allow_rerere_autoupdate),
>  		OPT_BOOL('k', "keep-empty", &options.keep_empty,
>  			 N_("preserve empty commits during rebase")),
> +		OPT_BOOL(0, "update-branches", &options.update_branches,
> +			 N_("update branches that point to reapplied commits")),
>  		OPT_BOOL(0, "autosquash", &options.autosquash,
>  			 N_("move commits that begin with "
>  			    "squash!/fixup! under -i")),
> @@ -1710,6 +1716,9 @@ int cmd_rebase(int argc, const char **argv, const char *prefix)
>  	if (options.keep_empty)
>  		imply_interactive(&options, "--keep-empty");
>
> +	if (options.update_branches)
> +		imply_interactive(&options, "--update-branches");

It should not really imply `--interactive`, but `--rebase-merges`.
Imagine for example this call:

	git rebase --update-branches @{u}

This would not even be interactive (even if it uses the interactive
rebase backend to execute), but it would have to rebase the merge
commits. So I think we need something like this instead:

	if (options.update_branches && !options.rebase_merges) {
		if (options.type == REBASE_UNSPECIFIED)
			options.type = REBASE_INTERACTIVE;
		else if (options.type != REBASE_INTERACTIVE)
			die(_("%s requires an interactive rebase"),
			    "--update-branches");
		options.rebase_merges = 1;
	}

I don't think we can use `imply_interactive()` here because that would
also allow `--update-branches` to be combined with `--preserve-merges`
(which does not work, and as the latter is deprecated, I would advise
against trying to make it work, either).

> +
>  	if (gpg_sign) {
>  		free(options.gpg_sign_opt);
>  		options.gpg_sign_opt = xstrfmt("-S%s", gpg_sign);
> [...]
> diff --git a/sequencer.c b/sequencer.c
> index 34ebf8e..c6749ff 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -4901,6 +4901,69 @@ void todo_list_add_exec_commands(struct todo_list *todo_list,
>  	todo_list->alloc = alloc;
>  }
>
> +/*
> + * Add commands to update branch refs after the todo list would pick a commit
> + * that a branch ref points to.
> + */
> +static void todo_list_add_branch_updates(struct todo_list *todo_list,
> +				       const char *head_name)
> +{
> +	struct strbuf *buf = &todo_list->buf;
> +	int i, nr = 0, alloc = 0;
> +	struct todo_item *items = NULL;
> +
> +	load_ref_decorations(NULL, 0);

This loads also tags, correct? I am fairly certain that we don't want to
update tags here, but maybe the check for `DECORATION_REF_LOCAL` later
on already ensures that?

> +
> +	for (i = 0; i < todo_list->nr; i++) {
> +		const struct todo_item *item = &todo_list->items[i];
> +		enum todo_command command = item->command;
> +		const struct name_decoration *decoration;
> +
> +		ALLOC_GROW(items, nr + 1, alloc);
> +		items[nr++] = todo_list->items[i];
> +
> +		switch (command) {
> +		case TODO_PICK:
> +		case TODO_MERGE:
> +			break;
> +		default:
> +			continue;
> +		}

I think this misses the `edit` and `reword` commands. How about using
`is_pick_or_similar()` instead?

Of course, there is one slight problem with that: you would _also_ have
to call `is_fixup()` and delay the addition of the corresponding branch
updates (you don't want to apply multiple `fixup` commits and possibly
point a local branch to an already fixed-up commit).

> +
> +		decoration = get_name_decoration(&item->commit->object);
> +		for (; decoration; decoration = decoration->next) {
> +			size_t base_offset, pretty_name_len;
> +			const char *pretty_name;
> +
> +			if (decoration->type != DECORATION_REF_LOCAL)
> +				continue;
> +			if (!strcmp(decoration->name, head_name))
> +				// Rebase itself will update the current branch for us.

Please use C-style /* ... */ comments, Git insists on not using
C++-style // comments.

> +				continue;

My instant reaction was: just compare `&item->commit->object.oid` to
`head_hash`, but your code is careful to take care of the scenario where
multiple local branches point to the pre-rebase `HEAD`. Good. Maybe you
want to test for that in the regression test, too?

However, you have two `if` conditions that both guard the same
operation: `continue`. How about combining the combinations? It's like
saying: under these circumstances, we skip adding a command.

> +
> +			base_offset = buf->len;
> +			pretty_name = prettify_refname(decoration->name);
> +			pretty_name_len = strlen(pretty_name);
> +			strbuf_addstr(buf, "exec git branch -f ");
> +			strbuf_addstr(buf, pretty_name);
> +			strbuf_addch(buf, '\n');
> +
> +			ALLOC_GROW(items, nr + 1, alloc);
> +			items[nr++] = (struct todo_item) {
> +				.command = TODO_EXEC,
> +				.offset_in_buf = base_offset,
> +				.arg_offset = base_offset + strlen("exec "),
> +				.arg_len = strlen("git branch -f ") + pretty_name_len,
> +			};

Wow. This is the first time I see anybody trying to use this kind of
assignment: `<item> = (struct <name>) { ... }`. I don't really know that
we can expect all C compilers that are currently used to compile Git to
handle this kind of construct. So I'd rather be a little bit more
verbose and safe:

			memset(items[nr], 0, sizeof(items[nr]));
			items[nr].command = TODO_EXEC;
			[...]
			nr++;

Taking another step back, I think we should actually be super lazy here
(it's not performance-critical code, so we can save us some time
verifying correctness, at the cost of a little slower execution):

			size_t len = buf->len;
			strbuf_addf(buf, "exec git branch -f %s\n",
				    prettify_refname(decoration->name));
			parse_insn_line(r, items[nr], buf.buf,
					buf.buf + len, buf.buf + buf.len);

Come to think of it, I wonder whether you could simply use a full
`struct todo_list` instead of maintaining the `items` manually, and then
use `append_new_todo()`.

At the end, you would simply call

	SWAP(new_list.items, todo_list->items);
	SWAP(new_list.nr, todo_list->nr);
	SWAP(new_list.alloc, todo_list->alloc);
	todo_list_release(&new_list);

Now, let's talk about this command: `git branch -f <name>`. I _think_ it
is the right command to use (instead of `git update-ref`, for example),
yet I have a couple doubts about that:

- with `git update-ref <name> HEAD <old-commit>`, we could safe-guard
  against ref updates while the rebase is running. This is _not_
  possible with `git branch -f`... We would overwrite the branch, even
  if another Git process had updated it in the meantime, e.g. in a
  manually-inserted `exec` call.

- we definitely want to try our best to handle the case where the branch
  in question is currently checked out in a separate worktree. This is a
  use case that bit me in the past (with my complicated vim method of
  emulating `--update-branches`).

- If we use `git branch`, we _have_ to use the short branch name.
  However, `prettify_refname()` does not guarantee that, I don't think:
  it strips `refs/heads/` or `refs/tags/` or `refs/remotes/` _if found_.
  So I would feel a lot more comfortable with

  	if (!skip_prefix(decoration->name, "refs/heads/", &name))
		continue;

	[...]

	strbuf_addf(buf, "exec git branch -f %s\n", name);

- I actually do not know off-hand what the reflog message for the branch
  updates would be! Would it be prefixed by `rebase -i`? I think we
  should make sure that the code uses an informative reflog message, and
  definitely verify that behavior in the regression test.

- An `exec git branch -f <name>` will update that branch _immediately_.
  Even if the user runs into a road-block with the current rebase and
  decides to abandon it via `git rebase --abort`. A subsequent attempt
  at rebasing would _no longer be able_ to pick up on the local
  branches, as they would not have been rolled back.

This last point is worth dwelling on for a moment. The interactive
rebase goes out of its way to first switch to an unnamed branch, and
only update the current branch at the very end, when the rebase
succeeded. If the rebase does not succeed, i.e. if it is aborted or
quit, the current branch will not be updated. The interactive rebase is
therefore sort of "transactional": either it succeeds in its entirety,
or all changes are rolled back.

However, the way this patch is implemented, it would break that design,
because the local branches would not be updated at the very end, but
already while the transaction is still being processed, so to say.

Spontaneously, I can think of a few ways to address this:

1. Add a new todo command, or an option for the `label` command (`label
   --branch <name>`) where it would remember to update those refs upon
   successful completion, much like the file
   `<GIT-DIR>/rebase-merge/refs-to-delete` accumulates the
   `refs/rewritten/<label>` refs to delete at the end of the interactive
   rebase, and teach `todo_list_add_branch_updates()` to use it.

2. Maybe even better would be to add a `label <branch-name>` (if there
   is no such command already) instead of the `exec git branch -f`
   lines, and then add a slur of `exec git branch -f` lines at the end
   of the todo list that pick up those labels and run them through
   `update-ref` to update them. That way, if the user aborts the
   interactive rebase somewhere in the middle, the refs would not have
   been updated yet.

3. Another method would combine the approaches 1. & 2.: add an option to
   the `label` command, say, `--update-branch=<name>`, which would append
   the following line to `<GIT-DIR>/rebase-merge/branches-to-update`:

   	update <name> <old-hash>

   And then a single line would be appended to the end of the todo
   list:

   	exec git update-ref --stdin <"<GIT-DIR>/rebase-merge/branches-to-update"

   This would have the advantage of making things a bit more explicit,
   but it would break expectations when a user would add the
   `--update-branch=<name>` option manually and forgot to make sure that
   that `update-ref` line was appended to the end.

Of course, with all of those methods, you would still have to make sure
that the todo command would be inserted _after_ `fixup`/`squash` chains.

However, now that I think of it, the `fixup`/`squash` thing is actually
a pretty tricky thing to deal with in this context. Imagine this commit
graph (newest commit first):

	*   Merge branch 'fix-names'
	| \
	|  * fixup! Implement the `names` command
	|  * Document the `names` feature
	|  * Implement the `names` command
	* / Fiddle with the README.md
	o

The `fixup!` will be re-ordered before `todo_list_add_branch_updates()`
is called, yet we do not want to update the `fix-names` branch to point
to the fixed-up commit, but still to the tip of the branch!

This consideration brings me to my favorite idea so far:

4. If we were to extend the `label` command by the `--update-branch`
   option, it would be nice if we could reuse the actual branch name as
   label name. However, at this stage we no longer have access to the
   `label_oid()` function, as the todo list was generated a long time
   ago. And we would not be able to guarantee that the branch name was
   not already used as a label.

   So why not implement this feature as part of
   `make_script_with_merges()`?

   That way, we could have an initial run through the commits to see
   whether they are decorated with any local branch, and if they are,
   make sure that they get the label(s) they want.

   (The only tricky situation with that would be that you could have a
   local branch `refs/heads/<40-digit-hex>` that clashes with a `reset
   <40-digit-hex>`, which situation can arise under
   `no-rebase-cousins`.)

   The current design actually refuses to have multiple labels per OID,
   but that could be fixed easily, as there is nothing preventing us
   from declaring the `state.commit2label` to allow multiple values for
   the same key. We would simply have to record together with a label
   our wish to update a local branch under that name, e.g. by adding a
   field `unsigned update_branch:1;` to `struct label_entry`, and then
   adding the appropriate `label` commands.

   This method would also have the advantage of adding the `label`
   commands in the right location even in the `fixup!` case I
   illustrated above: the `label` command would be added before the
   `fixup` command gets reordered.

   And it would probably make more sense to _not_ require that `exec git
   update-ref [...]` line at the end of the todo list, but to perform
   the updates transparently similarly to how the `refs/rewritten/` refs
   are removed at the end of an interactive rebase.

> +		}
> +	}
> +
> +	FREE_AND_NULL(todo_list->items);
> +	todo_list->items = items;
> +	todo_list->nr = nr;
> +	todo_list->alloc = alloc;
> +}
> +
>  static void todo_list_to_strbuf(struct repository *r, struct todo_list *todo_list,
>  				struct strbuf *buf, int num, unsigned flags)
>  {
> [...]
> diff --git a/t/t3431-rebase-update-branches.sh b/t/t3431-rebase-update-branches.sh
> new file mode 100755
> index 0000000..221c25d
> --- /dev/null
> +++ b/t/t3431-rebase-update-branches.sh
> @@ -0,0 +1,64 @@
> +#!/bin/sh
> +
> +test_description='git rebase -i --update-branches

How about dropping that `-i`? I think this option should also work
without an explicit `--interactive`.

> +
> +This test runs git rebase, moving branch refs that point to commits
> +that are reapplied.
> +
> +Initial setup:
> +
> + A - B          (master)
> +  |\
> +  |  C          (linear-early)
> +  |    \
> +  |      D      (linear-late)
> +  |\
> +  |  E          (feat-e)
> +   \   \
> +     F  |       (feat-f)
> +       \|
> +         G      (interim)
> +           \
> +             H  (my-dev)
> +'
> +. ./test-lib.sh
> +
> +test_expect_success 'setup linear' '
> +	test_commit A &&
> +	test_commit B &&
> +	git checkout -b linear-early A &&
> +	test_commit C &&
> +	git checkout -b linear-late &&
> +	test_commit D
> +'
> +
> +test_expect_success 'smoketest linear' '
> +	git rebase --update-branches master
> +'
> +
> +test_expect_success 'check linear' '
> +	git rev-parse linear-early:B.t

I'd like to suggest this command instead:

	test_cmp_rev linear-early C

The `C` tag was created automatically by `test_commit C`, and the
`test_cmp_rev` command has the advantage of helping diagnose regressions
much better than `git rev-parse` would.

> +'
> +
> +test_expect_success 'setup merge' '
> +	git checkout -b feat-e A &&
> +	test_commit E &&
> +	git checkout -b feat-f A &&
> +	test_commit F &&
> +	git checkout -b interim &&
> +	test_merge G feat-e &&
> +	git checkout -b my-dev &&
> +	test_commit H
> +'
> +
> +test_expect_success 'smoketest merge' '
> +	git rebase -r --update-branches master

Let's try to make this work without `-r`, too.

> +'
> +
> +test_expect_success 'check merge' '
> +	git rev-parse feat-e:B.t &&
> +	git rev-parse feat-f:B.t &&
> +	git rev-parse interim:B.t

Likewise, I would love to see `test_cmp_rev` calls here.

Thank you for working on this feature, this is exciting!

Ciao,
Johannes

> +'
> +
> +test_done
> --
> 2.7.4
>
>
Warren He Sept. 7, 2019, 11:44 p.m. UTC | #2
Everyone in this thread, thanks for your support and encouragement.

Johannes, thanks for reviewing.

> Maybe `s/reapplied/rebased/`?

Ok. I've changed most occurrences, except in Documentation/git-rebase.txt, where
the term 'reapplied' is already in use.

> drop this hunk and only keep the next one.

I didn't know that. (Actually, assume this for most of these responses.)
Dropped, thanks.

> It should not really imply `--interactive`, but `--rebase-merges`.

`imply_interactive` doesn't fully switch on `--interactive`, i.e., causing the
editor to open. It only selects the backend, which I think we're saying is the
right thing. I've dropped the `-i` from the test description.

And we don't really have to imply --rebase-merges, in case someone would prefer
to linearize things, which who knows? Running that non-rebase-merges command in
the example scenario from my original post should give something like this:

```
 A - B              (master)
       \
         F          (feat-f)
           \ 
             E      (feat-e)
               \
                 H  (my-dev)
```

So for now I haven't moved the implementation into `make_script_with_merges`.

> This loads also tags, correct? I am fairly certain that we don't want to
> update tags here, but maybe the check for `DECORATION_REF_LOCAL` later
> on already ensures that?

Right on both points. This isn't as efficient as possible, since we're wasting
the work of loading tags and remote refs. Currently I don't know if the
performance is worth the maintainability cost of replicating most of the
`load_ref_decorations` and `get_name_decoration` family of functions and global
variables though.

> How about using `is_pick_or_similar()` instead?

That's the function I need. Although I'm not aware of anything that generates
`edit` or `reword` commands before we'll call `todo_list_add_branch_updates`. I
ended up not needing further logic with `is_fixup`. See below for the new
handling of fixup chains.

> Please use C-style /* ... */ comments, Git insists on not using
> C++-style // comments.

Thanks for pointing that out. Changed.

> your code is careful to take care of the scenario where
> multiple local branches point to the pre-rebase `HEAD`. Good. Maybe you
> want to test for that in the regression test, too?

Ah, yes I do. It is added, with `my-hotfixes` = `HEAD` -> `my-dev` in the test.

> However, you have two `if` conditions that both guard the same
> operation: `continue`. How about combining the combinations? It's like
> saying: under these circumstances, we skip adding a command.

Ok. Combined.

> [several ways to simplify how we build todo items]

I had only looked at how `todo_list_add_exec_commands` works. Let's try doing it
with a full `struct todo_list` and `parse_insn_line`. Thanks for posting these
suggestions.

> [handling `fixup`/`squash` chains]

I've moved `todo_list_add_branch_updates` to run before
`todo_list_rearrange_squash`. The rearranging pulls fixups out, causing the
branch update to "fall" onto the items before, and reinserts them between a
commit and its branch update, casing them to be included in the updated branch.
which is my opinion of the right thing to do. I've added a test about this with
the following scenario:

```
 A - B  (master)
   \
     I - J - fixup! I                 (fixup-early)
		      \
			K - fixup! J  (fixup-late)
```

which results in the following todo list with `--autosquash`:

```
pick 9eadc32 I
fixup 265fa32 fixup! I
pick a0754fc J
fixup e7d1999 fixup! J
exec git branch -f fixup-early
pick c8bc4af K
```

> I'd like to suggest [`test_cmp_rev`] instead

I've updated the test to use `test_cmp_rev`. It's not with your suggested
invocation though. We don't update the `C` tag. I've referred to the rebased `C`
with `test_cmp_rev linear-early HEAD^` and similar for the other checks.

* * *

And then there's the discussion about using `exec git branch -f`. To summarize
the issues collected from the entire thread:

1. the changes aren't atomically applied at the end of the rebase
2. it fails when the branch is checked out in a worktree
3. it clobbers the branch if anything else updates it during the rebase
4. the way we prepare the unprefixed branch doesn't work right some exotic cases
5. the reflog message it leaves is uninformative

For #4, I think we've lucked out actually. The `load_ref_decorations` routine we
use determines that a ref is `DECORATION_REF_LOCAL` under the condition
`starts_with(refname, "refs/heads/")` (log-tree.c:114, add_ref_decoration), so
`prettify_refname` will find the prefix and skip it. But that's an invariant
maintained by two pieces of code pretty far away from each other.

For #5, for the convenience of readers, the reflog entry it leaves looks like this:

```
00873f2 feat-e@{0}: branch: Reset to HEAD
```

Not great.

I haven't made any changes to this yet, but I've thought about what I want. My
favorite so far is to add a new todo command that just does everything right. It
would make a temparary ref `refs/rewritten-heads/xxx` (or something), and update
`refs/heads/xxx` at the end.

I agree that requiring a separate update-ref step at the end of the todo list is
unfriendly. Manually putting in some branch update commands and then realizing
that they weren't applied would be extremely frustrating. I don't see the option
of using existing tools as the easiest-to-use solution.

I'm reluctant to combine this with the existing `label` command. So far it
sounds like we generally want to be more willing to skip branch updates while
performing the rebase, with aforementioned scenarios where something else
updates the branch before we do, or if the branch becomes checked out in a
worktree. We don't want to mess up the structure of a `rebase -r` as a result of
skipping some branch updates. I think it would be conceptually simpler and
implementation-wise less tricky if we didn't combine it with the `label` and
`reset` system.

Warren He (1):
  rebase: introduce --update-branches option

 Documentation/git-rebase.txt      |  9 ++++
 builtin/rebase.c                  | 11 ++++-
 sequencer.c                       | 58 +++++++++++++++++++++++-
 sequencer.h                       |  6 ++-
 t/t3431-rebase-update-branches.sh | 94 +++++++++++++++++++++++++++++++++++++++
 5 files changed, 173 insertions(+), 5 deletions(-)
 create mode 100755 t/t3431-rebase-update-branches.sh
Phillip Wood Sept. 9, 2019, 10:44 a.m. UTC | #3
Hi Warren

On 08/09/2019 00:44, Warren He wrote:
> Everyone in this thread, thanks for your support and encouragement.
> [...]
>> It should not really imply `--interactive`, but `--rebase-merges`.
> 
> `imply_interactive` doesn't fully switch on `--interactive`, i.e., causing the
> editor to open. It only selects the backend, which I think we're saying is the
> right thing. I've dropped the `-i` from the test description.
> 
> And we don't really have to imply --rebase-merges, in case someone would prefer
> to linearize things, which who knows? Running that non-rebase-merges command in
> the example scenario from my original post should give something like this:

I think it would probably be less confusing to default to preserving 
merges and having an option to turn that off - people are going to be 
surprised if their history is linearized.

> ```
>   A - B              (master)
>         \
>           F          (feat-f)
>             \
>               E      (feat-e)
>                 \
>                   H  (my-dev)
> ```
> 
> So for now I haven't moved the implementation into `make_script_with_merges`.
>
 > [...]>
>> [handling `fixup`/`squash` chains]
> 
> I've moved `todo_list_add_branch_updates` to run before
> `todo_list_rearrange_squash`. The rearranging pulls fixups out, causing the
> branch update to "fall" onto the items before, and reinserts them between a
> commit and its branch update, casing them to be included in the updated branch.
> which is my opinion of the right thing to do. I've added a test about this with
> the following scenario:
> 
> ```
>   A - B  (master)
>     \
>       I - J - fixup! I                 (fixup-early)
> 		      \
> 			K - fixup! J  (fixup-late)
> ```
> 
> which results in the following todo list with `--autosquash`:
> 
> ```
> pick 9eadc32 I
> fixup 265fa32 fixup! I
> pick a0754fc J
> fixup e7d1999 fixup! J
> exec git branch -f fixup-early
> pick c8bc4af K
> ```

That makes sense I think

>> I'd like to suggest [`test_cmp_rev`] instead
> 
> I've updated the test to use `test_cmp_rev`. It's not with your suggested
> invocation though. We don't update the `C` tag. I've referred to the rebased `C`
> with `test_cmp_rev linear-early HEAD^` and similar for the other checks.

That sounds right

> * * *
> 
> And then there's the discussion about using `exec git branch -f`. To summarize
> the issues collected from the entire thread:
> 
> 1. the changes aren't atomically applied at the end of the rebase
> 2. it fails when the branch is checked out in a worktree
> 3. it clobbers the branch if anything else updates it during the rebase
> 4. the way we prepare the unprefixed branch doesn't work right some exotic cases
> 5. the reflog message it leaves is uninformative
> 
> For #4, I think we've lucked out actually. The `load_ref_decorations` routine we
> use determines that a ref is `DECORATION_REF_LOCAL` under the condition
> `starts_with(refname, "refs/heads/")` (log-tree.c:114, add_ref_decoration), so
> `prettify_refname` will find the prefix and skip it. But that's an invariant
> maintained by two pieces of code pretty far away from each other.
> 
> For #5, for the convenience of readers, the reflog entry it leaves looks like this:
> 
> ```
> 00873f2 feat-e@{0}: branch: Reset to HEAD
> ```
> 
> Not great.
> 
> I haven't made any changes to this yet, but I've thought about what I want. My
> favorite so far is to add a new todo command that just does everything right. It
> would make a temparary ref `refs/rewritten-heads/xxx` (or something), and update
> `refs/heads/xxx` at the end.

I think that's the best way to do it. If we had a command like 'branch 
<branch-name>' that creates a ref to remember the current HEAD and saves 
the current branch head. Then at the end rebase can update the branches 
to point to the saved commits if the branch is unchanged. If the rebase 
is aborted then we don't end up with some branches updated and others not.

Side Note
   I'd avoid creating another worktree local ref refs/rewritten-heads/.
   Either store them under refs/rewritten/ or refs/worktree/

Best Wishes

Phillip

> 
> I agree that requiring a separate update-ref step at the end of the todo list is
> unfriendly. Manually putting in some branch update commands and then realizing
> that they weren't applied would be extremely frustrating. I don't see the option
> of using existing tools as the easiest-to-use solution.
> 
> I'm reluctant to combine this with the existing `label` command. So far it
> sounds like we generally want to be more willing to skip branch updates while
> performing the rebase, with aforementioned scenarios where something else
> updates the branch before we do, or if the branch becomes checked out in a
> worktree. We don't want to mess up the structure of a `rebase -r` as a result of
> skipping some branch updates. I think it would be conceptually simpler and
> implementation-wise less tricky if we didn't combine it with the `label` and
> `reset` system.
> 
> Warren He (1):
>    rebase: introduce --update-branches option
> 
>   Documentation/git-rebase.txt      |  9 ++++
>   builtin/rebase.c                  | 11 ++++-
>   sequencer.c                       | 58 +++++++++++++++++++++++-
>   sequencer.h                       |  6 ++-
>   t/t3431-rebase-update-branches.sh | 94 +++++++++++++++++++++++++++++++++++++++
>   5 files changed, 173 insertions(+), 5 deletions(-)
>   create mode 100755 t/t3431-rebase-update-branches.sh
>
Johannes Schindelin Sept. 9, 2019, 2:13 p.m. UTC | #4
Hi,

On Mon, 9 Sep 2019, Phillip Wood wrote:

> On 08/09/2019 00:44, Warren He wrote:
> > Everyone in this thread, thanks for your support and encouragement.
> > [...]
> > > It should not really imply `--interactive`, but `--rebase-merges`.
> >
> > `imply_interactive` doesn't fully switch on `--interactive`, i.e., causing
> > the
> > editor to open. It only selects the backend, which I think we're saying is
> > the
> > right thing. I've dropped the `-i` from the test description.
> >
> > And we don't really have to imply --rebase-merges, in case someone would
> > prefer
> > to linearize things, which who knows? Running that non-rebase-merges command
> > in
> > the example scenario from my original post should give something like this:
>
> I think it would probably be less confusing to default to preserving merges

s/preserving/rebasing/?

> and having an option to turn that off - people are going to be surprised if
> their history is linearized.

I don't think it makes any sense to linearize the history while updating
branches, as the commits will be all jumbled up. Imagine this history:

	- A - B - C - D -
	    \       /
	      E - F

Typically, it does not elicit any "bug" reports, but this can easily be
linearized to

	- A' - B' - E' - C' - F' -

In my mind, it makes no sense to update any local branches that pointed
to C and F to point to C' and F', respectively.

> [...]
> > * * *
> >
> > And then there's the discussion about using `exec git branch -f`. To
> > summarize
> > the issues collected from the entire thread:
> >
> > 1. the changes aren't atomically applied at the end of the rebase
> > 2. it fails when the branch is checked out in a worktree
> > 3. it clobbers the branch if anything else updates it during the rebase
> > 4. the way we prepare the unprefixed branch doesn't work right some exotic
> > cases
> > 5. the reflog message it leaves is uninformative
> >
> > For #4, I think we've lucked out actually. The `load_ref_decorations`
> > routine we
> > use determines that a ref is `DECORATION_REF_LOCAL` under the condition
> > `starts_with(refname, "refs/heads/")` (log-tree.c:114, add_ref_decoration),
> > so
> > `prettify_refname` will find the prefix and skip it. But that's an invariant
> > maintained by two pieces of code pretty far away from each other.
> >
> > For #5, for the convenience of readers, the reflog entry it leaves looks
> > like this:
> >
> > ```
> > 00873f2 feat-e@{0}: branch: Reset to HEAD
> > ```
> >
> > Not great.
> >
> > I haven't made any changes to this yet, but I've thought about what I want.
> > My
> > favorite so far is to add a new todo command that just does everything
> > right. It
> > would make a temparary ref `refs/rewritten-heads/xxx` (or something), and
> > update
> > `refs/heads/xxx` at the end.
>
> I think that's the best way to do it. If we had a command like 'branch
> <branch-name>' that creates a ref to remember the current HEAD and saves the
> current branch head. Then at the end rebase can update the branches to point
> to the saved commits if the branch is unchanged. If the rebase is aborted then
> we don't end up with some branches updated and others not.

I'd avoid cluttering the space with more commands. For `branch`, for
example, the natural short command would be `b`, but that already means
`break`.

In contrast, I would think that

	label --update-branch my-side-track

would make for a nicer read than

	label my-side-track
	branch my-side-track

Of course, it would be a lot harder to bring back the safety of `git
update-ref`'s `<old-revision>` safe-guard, in either forms.

And of course, the first form would _require_ the logic to be moved to
`make_script_with_merges()` because we could not otherwise guarantee
that the labels corresponding to local branch names aren't already in
use, for different commits.

> Side Note
>   I'd avoid creating another worktree local ref refs/rewritten-heads/.
>   Either store them under refs/rewritten/ or refs/worktree/

Yep.

Ciao,
Dscho
Junio C Hamano Sept. 9, 2019, 6:11 p.m. UTC | #5
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> In contrast, I would think that
>
> 	label --update-branch my-side-track
>
> would make for a nicer read than
>
> 	label my-side-track
> 	branch my-side-track

Because labelling while recreating a mergey history is conceptually
the same as temporarily updating the branch head from end users'
point of view, so this sounds quite sensible.
Phillip Wood Sept. 23, 2019, 9:40 a.m. UTC | #6
Hi

On 09/09/2019 15:13, Johannes Schindelin wrote:
> Hi,
> 
> On Mon, 9 Sep 2019, Phillip Wood wrote:
> 
>> On 08/09/2019 00:44, Warren He wrote:
>>> Everyone in this thread, thanks for your support and encouragement.
>>> [...]
>>>> It should not really imply `--interactive`, but `--rebase-merges`.
>>>
>>> `imply_interactive` doesn't fully switch on `--interactive`, i.e., causing
>>> the
>>> editor to open. It only selects the backend, which I think we're saying is
>>> the
>>> right thing. I've dropped the `-i` from the test description.
>>>
>>> And we don't really have to imply --rebase-merges, in case someone would
>>> prefer
>>> to linearize things, which who knows? Running that non-rebase-merges command
>>> in
>>> the example scenario from my original post should give something like this:
>>
>> I think it would probably be less confusing to default to preserving merges
> 
> s/preserving/rebasing/?
> 
>> and having an option to turn that off - people are going to be surprised if
>> their history is linearized.
> 
> I don't think it makes any sense to linearize the history while updating
> branches, as the commits will be all jumbled up. Imagine this history:
> 
> 	- A - B - C - D -
> 	    \       /
> 	      E - F
> 
> Typically, it does not elicit any "bug" reports, but this can easily be
> linearized to
> 
> 	- A' - B' - E' - C' - F' -
> 
> In my mind, it makes no sense to update any local branches that pointed
> to C and F to point to C' and F', respectively.
> 

I agree

>> [...]
>>> * * *
>>>
>>> And then there's the discussion about using `exec git branch -f`. To
>>> summarize
>>> the issues collected from the entire thread:
>>>
>>> 1. the changes aren't atomically applied at the end of the rebase
>>> 2. it fails when the branch is checked out in a worktree
>>> 3. it clobbers the branch if anything else updates it during the rebase
>>> 4. the way we prepare the unprefixed branch doesn't work right some exotic
>>> cases
>>> 5. the reflog message it leaves is uninformative
>>>
>>> For #4, I think we've lucked out actually. The `load_ref_decorations`
>>> routine we
>>> use determines that a ref is `DECORATION_REF_LOCAL` under the condition
>>> `starts_with(refname, "refs/heads/")` (log-tree.c:114, add_ref_decoration),
>>> so
>>> `prettify_refname` will find the prefix and skip it. But that's an invariant
>>> maintained by two pieces of code pretty far away from each other.
>>>
>>> For #5, for the convenience of readers, the reflog entry it leaves looks
>>> like this:
>>>
>>> ```
>>> 00873f2 feat-e@{0}: branch: Reset to HEAD
>>> ```
>>>
>>> Not great.
>>>
>>> I haven't made any changes to this yet, but I've thought about what I want.
>>> My
>>> favorite so far is to add a new todo command that just does everything
>>> right. It
>>> would make a temparary ref `refs/rewritten-heads/xxx` (or something), and
>>> update
>>> `refs/heads/xxx` at the end.
>>
>> I think that's the best way to do it. If we had a command like 'branch
>> <branch-name>' that creates a ref to remember the current HEAD and saves the
>> current branch head. Then at the end rebase can update the branches to point
>> to the saved commits if the branch is unchanged. If the rebase is aborted then
>> we don't end up with some branches updated and others not.
> 
> I'd avoid cluttering the space with more commands. For `branch`, for
> example, the natural short command would be `b`, but that already means
> `break`.

We could just not have a short name, after all --update-branch is hardly 
a short alternative

> 
> In contrast, I would think that
> 
> 	label --update-branch my-side-track
> 
> would make for a nicer read than
> 
> 	label my-side-track
> 	branch my-side-track

I agree it would be nice to do both on a single line, my argument was 
mainly against using 'exec branch ...' so that we can defer the branch 
updates until the very end of the rebase. The branch command could set a 
label as well or we could add an option to label I'm not that bothered 
either way at the moment. Another possibility which we probably don't 
want is to have labels starting refs/ imply --update-branch

> Of course, it would be a lot harder to bring back the safety of `git
> update-ref`'s `<old-revision>` safe-guard, in either forms.

Is it that difficult to write the current branch HEAD to a file when we 
label it and then read those back at the end when we update the refs? or 
are you thinking of calling 'git branch' instead of 
ref_transaction_update()? I'm not sure what the advantage of 'git 
branch' is though.

Best Wishes

Phillip

> 
> And of course, the first form would _require_ the logic to be moved to
> `make_script_with_merges()` because we could not otherwise guarantee
> that the labels corresponding to local branch names aren't already in
> use, for different commits.
> 
>> Side Note
>>    I'd avoid creating another worktree local ref refs/rewritten-heads/.
>>    Either store them under refs/rewritten/ or refs/worktree/
> 
> Yep.
> 
> Ciao,
> Dscho
>
diff mbox series

Patch

diff --git a/Documentation/git-rebase.txt b/Documentation/git-rebase.txt
index 6156609..c37a0db 100644
--- a/Documentation/git-rebase.txt
+++ b/Documentation/git-rebase.txt
@@ -246,6 +246,13 @@  leave out at most one of A and B, in which case it defaults to HEAD.
 +
 See also INCOMPATIBLE OPTIONS below.
 
+--update-branches::
+	If there are branch refs that point to commits that will be
+	reapplied, add shell commands to the todo list to update those
+	refs to point to the commits in the final history.
++
+See also INCOMPATIBLE OPTIONS below.
+
 --allow-empty-message::
 	By default, rebasing commits with an empty message will fail.
 	This option overrides that behavior, allowing commits with empty
@@ -535,6 +542,7 @@  are incompatible with the following options:
  * --interactive
  * --exec
  * --keep-empty
+ * --update-branches
  * --edit-todo
  * --root when used in combination with --onto
 
diff --git a/builtin/rebase.c b/builtin/rebase.c
index 670096c..cf87c53 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -78,6 +78,7 @@  struct rebase_options {
 	int signoff;
 	int allow_rerere_autoupdate;
 	int keep_empty;
+	int update_branches;
 	int autosquash;
 	char *gpg_sign_opt;
 	int autostash;
@@ -349,8 +350,8 @@  static int do_interactive_rebase(struct rebase_options *opts, unsigned flags)
 
 		split_exec_commands(opts->cmd, &commands);
 		ret = complete_action(the_repository, &replay, flags,
-			shortrevisions, opts->onto_name, opts->onto, head_hash,
-			&commands, opts->autosquash, &todo_list);
+			shortrevisions, opts->onto_name, opts->onto, opts->head_name,
+			head_hash, &commands, opts->autosquash, &todo_list);
 	}
 
 	string_list_clear(&commands, 0);
@@ -375,6 +376,7 @@  static int run_rebase_interactive(struct rebase_options *opts,
 	flags |= opts->rebase_merges ? TODO_LIST_REBASE_MERGES : 0;
 	flags |= opts->rebase_cousins > 0 ? TODO_LIST_REBASE_COUSINS : 0;
 	flags |= command == ACTION_SHORTEN_OIDS ? TODO_LIST_SHORTEN_IDS : 0;
+	flags |= opts->update_branches ? TODO_LIST_UPDATE_BRANCHES : 0;
 
 	switch (command) {
 	case ACTION_NONE: {
@@ -447,6 +449,8 @@  int cmd_rebase__interactive(int argc, const char **argv, const char *prefix)
 		OPT_NEGBIT(0, "ff", &opts.flags, N_("allow fast-forward"),
 			   REBASE_FORCE),
 		OPT_BOOL(0, "keep-empty", &opts.keep_empty, N_("keep empty commits")),
+		OPT_BOOL(0, "update-branches", &opts.update_branches,
+			 N_("update branches that point to reapplied commits")),
 		OPT_BOOL(0, "allow-empty-message", &opts.allow_empty_message,
 			 N_("allow commits with empty messages")),
 		OPT_BOOL(0, "rebase-merges", &opts.rebase_merges, N_("rebase merge commits")),
@@ -1453,6 +1457,8 @@  int cmd_rebase(int argc, const char **argv, const char *prefix)
 		OPT_RERERE_AUTOUPDATE(&options.allow_rerere_autoupdate),
 		OPT_BOOL('k', "keep-empty", &options.keep_empty,
 			 N_("preserve empty commits during rebase")),
+		OPT_BOOL(0, "update-branches", &options.update_branches,
+			 N_("update branches that point to reapplied commits")),
 		OPT_BOOL(0, "autosquash", &options.autosquash,
 			 N_("move commits that begin with "
 			    "squash!/fixup! under -i")),
@@ -1710,6 +1716,9 @@  int cmd_rebase(int argc, const char **argv, const char *prefix)
 	if (options.keep_empty)
 		imply_interactive(&options, "--keep-empty");
 
+	if (options.update_branches)
+		imply_interactive(&options, "--update-branches");
+
 	if (gpg_sign) {
 		free(options.gpg_sign_opt);
 		options.gpg_sign_opt = xstrfmt("-S%s", gpg_sign);
diff --git a/sequencer.c b/sequencer.c
index 34ebf8e..c6749ff 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -4901,6 +4901,69 @@  void todo_list_add_exec_commands(struct todo_list *todo_list,
 	todo_list->alloc = alloc;
 }
 
+/*
+ * Add commands to update branch refs after the todo list would pick a commit
+ * that a branch ref points to.
+ */
+static void todo_list_add_branch_updates(struct todo_list *todo_list,
+				       const char *head_name)
+{
+	struct strbuf *buf = &todo_list->buf;
+	int i, nr = 0, alloc = 0;
+	struct todo_item *items = NULL;
+
+	load_ref_decorations(NULL, 0);
+
+	for (i = 0; i < todo_list->nr; i++) {
+		const struct todo_item *item = &todo_list->items[i];
+		enum todo_command command = item->command;
+		const struct name_decoration *decoration;
+
+		ALLOC_GROW(items, nr + 1, alloc);
+		items[nr++] = todo_list->items[i];
+
+		switch (command) {
+		case TODO_PICK:
+		case TODO_MERGE:
+			break;
+		default:
+			continue;
+		}
+
+		decoration = get_name_decoration(&item->commit->object);
+		for (; decoration; decoration = decoration->next) {
+			size_t base_offset, pretty_name_len;
+			const char *pretty_name;
+
+			if (decoration->type != DECORATION_REF_LOCAL)
+				continue;
+			if (!strcmp(decoration->name, head_name))
+				// Rebase itself will update the current branch for us.
+				continue;
+
+			base_offset = buf->len;
+			pretty_name = prettify_refname(decoration->name);
+			pretty_name_len = strlen(pretty_name);
+			strbuf_addstr(buf, "exec git branch -f ");
+			strbuf_addstr(buf, pretty_name);
+			strbuf_addch(buf, '\n');
+
+			ALLOC_GROW(items, nr + 1, alloc);
+			items[nr++] = (struct todo_item) {
+				.command = TODO_EXEC,
+				.offset_in_buf = base_offset,
+				.arg_offset = base_offset + strlen("exec "),
+				.arg_len = strlen("git branch -f ") + pretty_name_len,
+			};
+		}
+	}
+
+	FREE_AND_NULL(todo_list->items);
+	todo_list->items = items;
+	todo_list->nr = nr;
+	todo_list->alloc = alloc;
+}
+
 static void todo_list_to_strbuf(struct repository *r, struct todo_list *todo_list,
 				struct strbuf *buf, int num, unsigned flags)
 {
@@ -5051,7 +5114,7 @@  static int skip_unnecessary_picks(struct repository *r,
 
 int complete_action(struct repository *r, struct replay_opts *opts, unsigned flags,
 		    const char *shortrevisions, const char *onto_name,
-		    struct commit *onto, const char *orig_head,
+		    struct commit *onto, const char *head_name, const char *orig_head,
 		    struct string_list *commands, unsigned autosquash,
 		    struct todo_list *todo_list)
 {
@@ -5076,6 +5139,9 @@  int complete_action(struct repository *r, struct replay_opts *opts, unsigned fla
 	if (commands->nr)
 		todo_list_add_exec_commands(todo_list, commands);
 
+	if (flags & TODO_LIST_UPDATE_BRANCHES)
+		todo_list_add_branch_updates(todo_list, head_name);
+
 	if (count_commands(todo_list) == 0) {
 		apply_autostash(opts);
 		sequencer_remove_state(opts);
diff --git a/sequencer.h b/sequencer.h
index 6704acb..69c6f71 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -143,6 +143,7 @@  int sequencer_remove_state(struct replay_opts *opts);
  */
 #define TODO_LIST_REBASE_COUSINS (1U << 4)
 #define TODO_LIST_APPEND_TODO_HELP (1U << 5)
+#define TODO_LIST_UPDATE_BRANCHES (1U << 6)
 
 int sequencer_make_script(struct repository *r, struct strbuf *out, int argc,
 			  const char **argv, unsigned flags);
@@ -152,8 +153,9 @@  void todo_list_add_exec_commands(struct todo_list *todo_list,
 int check_todo_list_from_file(struct repository *r);
 int complete_action(struct repository *r, struct replay_opts *opts, unsigned flags,
 		    const char *shortrevisions, const char *onto_name,
-		    struct commit *onto, const char *orig_head, struct string_list *commands,
-		    unsigned autosquash, struct todo_list *todo_list);
+		    struct commit *onto, const char *head_name, const char *orig_head,
+		    struct string_list *commands, unsigned autosquash,
+		    struct todo_list *todo_list);
 int todo_list_rearrange_squash(struct todo_list *todo_list);
 
 /*
diff --git a/t/t3431-rebase-update-branches.sh b/t/t3431-rebase-update-branches.sh
new file mode 100755
index 0000000..221c25d
--- /dev/null
+++ b/t/t3431-rebase-update-branches.sh
@@ -0,0 +1,64 @@ 
+#!/bin/sh
+
+test_description='git rebase -i --update-branches
+
+This test runs git rebase, moving branch refs that point to commits
+that are reapplied.
+
+Initial setup:
+
+ A - B          (master)
+  |\
+  |  C          (linear-early)
+  |    \
+  |      D      (linear-late)
+  |\
+  |  E          (feat-e)
+   \   \
+     F  |       (feat-f)
+       \|
+         G      (interim)
+           \
+             H  (my-dev)
+'
+. ./test-lib.sh
+
+test_expect_success 'setup linear' '
+	test_commit A &&
+	test_commit B &&
+	git checkout -b linear-early A &&
+	test_commit C &&
+	git checkout -b linear-late &&
+	test_commit D
+'
+
+test_expect_success 'smoketest linear' '
+	git rebase --update-branches master
+'
+
+test_expect_success 'check linear' '
+	git rev-parse linear-early:B.t
+'
+
+test_expect_success 'setup merge' '
+	git checkout -b feat-e A &&
+	test_commit E &&
+	git checkout -b feat-f A &&
+	test_commit F &&
+	git checkout -b interim &&
+	test_merge G feat-e &&
+	git checkout -b my-dev &&
+	test_commit H
+'
+
+test_expect_success 'smoketest merge' '
+	git rebase -r --update-branches master
+'
+
+test_expect_success 'check merge' '
+	git rev-parse feat-e:B.t &&
+	git rev-parse feat-f:B.t &&
+	git rev-parse interim:B.t
+'
+
+test_done