[v2,0/7] rebase: update branches in multi-part topic

Message ID	pull.1247.v2.git.1654634569.gitgitgadget@gmail.com (mailing list archive)
Headers	show Return-Path: <git-owner@kernel.org> Message-Id: <pull.1247.v2.git.1654634569.gitgitgadget@gmail.com> In-Reply-To: <pull.1247.git.1654263472.gitgitgadget@gmail.com> References: <pull.1247.git.1654263472.gitgitgadget@gmail.com> From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> Date: Tue, 07 Jun 2022 20:42:42 +0000 Subject: [PATCH v2 0/7] rebase: update branches in multi-part topic Fcc: Sent Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, johannes.schindelin@gmx.de, me@ttaylorr.com, Jeff Hostetler <git@jeffhostetler.com>, Phillip Wood <phillip.wood123@gmail.com>, Elijah Newren <newren@gmail.com>, Derrick Stolee <derrickstolee@github.com> Precedence: bulk
Series	rebase: update branches in multi-part topic \| expand [v2,0/7] rebase: update branches in multi-part topic [v2,1/7] log-tree: create for_each_decoration() [v2,2/7] branch: add branch_checked_out() helper [v2,3/7] sequencer: define array with enum values [v2,4/7] sequencer: add update-refs command [v2,5/7] rebase: add --update-refs option [v2,6/7] sequencer: implement 'update-refs' command [v2,7/7] rebase: add rebase.updateRefs config option

Jean-Noël Avila via GitGitGadget June 7, 2022, 8:42 p.m. UTC

This is a feature I've wanted for quite a while. When working on the sparse
index topic, I created a long RFC that actually broke into three topics for
full review upstream. These topics were sequential, so any feedback on an
earlier one required updates to the later ones. I would work on the full
feature and use interactive rebase to update the full list of commits.
However, I would need to update the branches pointing to those sub-topics.

This series adds a new --update-refs option to 'git rebase' (along with a
rebase.updateRefs config option) that adds 'label' and 'update-refs'
commands into the TODO list. This is powered by the commit decoration
machinery.

As an example, here is my in-progress bundle URI RFC split into subtopics as
they appear during the TODO list of a git rebase -i --update-refs:

pick 2d966282ff3 docs: document bundle URI standard
pick 31396e9171a remote-curl: add 'get' capability
pick 54c6ab70f67 bundle-uri: create basic file-copy logic
pick 96cb2e35af1 bundle-uri: add support for http(s):// and file://
pick 6adaf842684 fetch: add --bundle-uri option
pick 6c5840ed77e fetch: add 'refs/bundle/' to log.excludeDecoration
label for-update-refs/refs/heads/bundle-redo/fetch

pick 1e3f6546632 clone: add --bundle-uri option
pick 9e4a6fe9b68 clone: --bundle-uri cannot be combined with --depth
label for-update-refs/refs/heads/bundle-redo/clone

pick 5451cb6599c bundle-uri: create bundle_list struct and helpers
pick 3029c3aca15 bundle-uri: create base key-value pair parsing
pick a8b2de79ce8 bundle-uri: create "key=value" line parsing
pick 92625a47673 bundle-uri: unit test "key=value" parsing
pick a8616af4dc2 bundle-uri: limit recursion depth for bundle lists
pick 9d6809a8d53 bundle-uri: parse bundle list in config format
pick 287a732b54c bundle-uri: fetch a list of bundles
label for-update-refs/refs/heads/bundle-redo/list

pick b09f8226185 protocol v2: add server-side "bundle-uri" skeleton
pick 520204dcd1c bundle-uri client: add minimal NOOP client
pick 62e8b457b48 bundle-uri client: add "git ls-remote-bundle-uri"
pick 00eae925043 bundle-uri: serve URI advertisement from bundle.* config
pick 4277440a250 bundle-uri client: add boolean transfer.bundleURI setting
pick caf4599a81d bundle-uri: allow relative URLs in bundle lists
pick df255000b7e bundle-uri: download bundles from an advertised list
pick d71beabf199 clone: unbundle the advertised bundles
pick c9578391976 t5601: basic bundle URI tests
# Ref refs/heads/bundle-redo/rfc-3 checked out at '/home/stolee/_git/git-bundles'

label for-update-refs/refs/heads/bundle-redo/advertise

update-refs


 * Patches 1-2 are basic refactors that we will need for our implementation.
 * Patch 3 is a cleanup of an array with indexes given by the todo_command
   enum.
 * Patches 4-6 implement the feature incrementally. For instance, we can
   test the edits to the TODO file in patch 5, but we don't actually do
   anything when seeing the 'update-refs' command until patch 6.
 * Patch 7 adds the rebase.updateRefs config option similar to
   rebase.autoSquash.


Updates in v2
=============

As recommended by the excellent feedback, I have removed the 'exec' commands
in favor of the 'label' commands and a new 'update-refs' command at the very
end. This way, there is only one step that updates all of the refs at the
end instead of updating refs during the rebase. If a user runs 'git rebase
--abort' in the middle, then their refs are still where they need to be.

Based on some of the discussion, it seemed like one way to do this would be
to have an 'update-ref ' command that would take the place of these 'label'
commands. However, this would require two things that make it a bit awkward:

 1. We would need to replicate the storage of those positions during the
    rebase. 'label' already does this pretty well. I've added the
    "for-update-refs/" label to help here.
 2. If we want to close out all of the refs as the rebase is finishing, then
    that "step" becomes invisible to the user (and a bit more complicated to
    insert). Thus, the 'update-refs' step performs this action. If the user
    wants to do things after that step, then they can do so by editing the
    TODO list.

Other updates:

 * The 'keep_decorations' parameter was renamed to 'update_refs'.
 * I added tests for --rebase-merges=rebase-cousins to show how these labels
   interact with other labels and merge commands.
 * I changed the order of the insertion of these update-refs labels to be
   before the fixups are rearranged. This fixes a bug where the tip commit
   is a fixup! so its decorations are never inspected (and they would be in
   the wrong place even if they were). The fixup! commands are properly
   inserted between a pick and its following label command. Tests
   demonstrate this is correct.
 * Numerous style choices are updated based on feedback.

Thank you for all of the detailed review and ideas in this space. I
appreciate any more ideas that can make this feature as effective as it can
be.

Thanks, -Stolee

Derrick Stolee (7):
  log-tree: create for_each_decoration()
  branch: add branch_checked_out() helper
  sequencer: define array with enum values
  sequencer: add update-refs command
  rebase: add --update-refs option
  sequencer: implement 'update-refs' command
  rebase: add rebase.updateRefs config option

 Documentation/config/rebase.txt |   3 +
 Documentation/git-rebase.txt    |  12 ++
 branch.c                        |  24 ++--
 branch.h                        |   8 ++
 builtin/rebase.c                |  10 ++
 log-tree.c                      | 111 +++++++++++------
 log-tree.h                      |   4 +
 sequencer.c                     | 206 +++++++++++++++++++++++++++++---
 sequencer.h                     |   2 +
 t/t3404-rebase-interactive.sh   | 110 +++++++++++++++++
 10 files changed, 432 insertions(+), 58 deletions(-)


base-commit: 2668e3608e47494f2f10ef2b6e69f08a84816bcb
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1247%2Fderrickstolee%2Frebase-keep-decorations-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1247/derrickstolee/rebase-keep-decorations-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1247

Range-diff vs v1:

 1:  4f9f3487641 = 1:  4f9f3487641 log-tree: create for_each_decoration()
 2:  5f54766e103 = 2:  5f54766e103 branch: add branch_checked_out() helper
 -:  ----------- > 3:  9f261c7df2c sequencer: define array with enum values
 -:  ----------- > 4:  842b2186d25 sequencer: add update-refs command
 3:  23fa6638864 ! 5:  0a4c110127b rebase: add --update-refs option
     @@ Commit message
          reachable from those "sub branches". It can take a manual step to update
          those branches.
      
     -    Add a new --update-refs option to 'git rebase -i' that adds 'git
     -    update-ref' exec steps to the todo file whenever a commit that is being
     -    rebased is decorated with that <ref>. This allows the user to rebase a
     -    long list of commits in a multi-part feature and keep all of their
     -    pointers to those parts.
     +    Add a new --update-refs option to 'git rebase -i' that adds 'label
     +    for-update-refs/*' steps to the todo file whenever a commit that is
     +    being rebased is decorated with that <ref>. At the very end, the
     +    'update-refs' step is added to update all of the branches referenced by
     +    the 'label' steps. This allows the user to rebase a long list of commits
     +    in a multi-part feature and keep all of their pointers to those parts.
     +
     +    NOTE: This change only introduce the --update-refs option and implements
     +    the changes to the todo file. It does _not_ yet implement the action
     +    taken by the 'update-refs' todo step, which will be implemented and
     +    tested in a later change.
      
          Use the new for_each_decoration() while iterating over the existing todo
          list. Be sure to iterate after any squashing or fixups are placed.
     @@ Documentation/git-rebase.txt: provided. Otherwise an explicit `--no-reschedule-f
       INCOMPATIBLE OPTIONS
       --------------------
       
     +@@ Documentation/git-rebase.txt: are incompatible with the following options:
     +  * --empty=
     +  * --reapply-cherry-picks
     +  * --edit-todo
     ++ * --update-refs
     +  * --root when used in combination with --onto
     + 
     + In addition, the following pairs of options are incompatible:
      
       ## builtin/rebase.c ##
      @@ builtin/rebase.c: struct rebase_options {
     @@ sequencer.c: static int skip_unnecessary_picks(struct repository *r,
       }
       
      +struct todo_add_branch_context {
     -+	struct todo_list new_list;
     ++	struct todo_item *items;
     ++	size_t items_nr;
     ++	size_t items_alloc;
      +	struct strbuf *buf;
      +	struct commit *commit;
      +};
     @@ sequencer.c: static int skip_unnecessary_picks(struct repository *r,
      +{
      +	struct todo_add_branch_context *ctx = data;
      +	size_t base_offset = ctx->buf->len;
     -+	int i = ctx->new_list.nr;
      +	struct todo_item *item;
      +	char *path;
      +
     -+	ALLOC_GROW(ctx->new_list.items,
     -+		   ctx->new_list.nr + 1,
     -+		   ctx->new_list.alloc);
     -+	item = &ctx->new_list.items[i];
     ++	ALLOC_GROW(ctx->items,
     ++		   ctx->items_nr + 1,
     ++		   ctx->items_alloc);
     ++	item = &ctx->items[ctx->items_nr];
     ++	memset(item, 0, sizeof(*item));
      +
      +	/* If the branch is checked out, then leave a comment instead. */
      +	if (branch_checked_out(d->name, &path)) {
     @@ sequencer.c: static int skip_unnecessary_picks(struct repository *r,
      +			    d->name, path);
      +		free(path);
      +	} else {
     -+		item->command = TODO_EXEC;
     -+		strbuf_addf(ctx->buf, "git update-ref %s HEAD %s\n",
     -+			    d->name, oid_to_hex(&ctx->commit->object.oid));
     ++		item->command = TODO_LABEL;
     ++		strbuf_addf(ctx->buf, "for-update-refs/%s\n", d->name);
      +	}
      +
     -+	item->commit = NULL;
      +	item->offset_in_buf = base_offset;
      +	item->arg_offset = base_offset;
      +	item->arg_len = ctx->buf->len - base_offset;
     -+	ctx->new_list.nr++;
     ++	ctx->items_nr++;
      +
      +	return 0;
      +}
      +
      +/*
      + * For each 'pick' command, find out if the commit has a decoration in
     -+ * refs/heads/. If so, then add a 'git branch -f' exec command after
     -+ * that 'pick' (plus any following 'squash' or 'fixup' commands).
     ++ * refs/heads/. If so, then add a 'label for-update-refs/' command.
      + */
     -+static int todo_list_add_branch_commands(struct todo_list *todo_list)
     ++static int todo_list_add_update_ref_commands(struct todo_list *todo_list)
      +{
      +	int i;
      +	static struct string_list decorate_refs_exclude = STRING_LIST_INIT_NODUP;
     @@ sequencer.c: static int skip_unnecessary_picks(struct repository *r,
      +		.exclude_ref_config_pattern = &decorate_refs_exclude_config
      +	};
      +	struct todo_add_branch_context ctx = {
     -+		.new_list = TODO_LIST_INIT,
      +		.buf = &todo_list->buf,
      +	};
      +
     ++	ctx.items_alloc = 2 * todo_list->nr + 1;
     ++	ALLOC_ARRAY(ctx.items, ctx.items_alloc);
     ++
      +	string_list_append(&decorate_refs_include, "refs/heads/");
      +	load_ref_decorations(&decoration_filter, 0);
      +
      +	for (i = 0; i < todo_list->nr; ) {
      +		struct todo_item *item = &todo_list->items[i];
      +
     -+		do {
     -+			/* insert ith item into new list */
     -+			ALLOC_GROW(ctx.new_list.items,
     -+				   ctx.new_list.nr + 1,
     -+				   ctx.new_list.alloc);
     ++		/* insert ith item into new list */
     ++		ALLOC_GROW(ctx.items,
     ++			   ctx.items_nr + 1,
     ++			   ctx.items_alloc);
      +
     -+			memcpy(&ctx.new_list.items[ctx.new_list.nr++],
     -+			       &todo_list->items[i],
     -+			       sizeof(struct todo_item));
     ++		ctx.items[ctx.items_nr++] = todo_list->items[i++];
      +
     -+			i++;
     -+		} while (i < todo_list->nr &&
     -+			 todo_list->items[i].command != TODO_PICK);
     -+
     -+		ctx.commit = item->commit;
     -+		for_each_decoration(item->commit, add_branch_for_decoration, &ctx);
     ++		if (item->commit) {
     ++			ctx.commit = item->commit;
     ++			for_each_decoration(item->commit,
     ++					    add_branch_for_decoration,
     ++					    &ctx);
     ++		}
      +	}
      +
     ++	/* Add the "update-refs" step. */
     ++	ALLOC_GROW(ctx.items,
     ++		   ctx.items_nr + 1,
     ++		   ctx.items_alloc);
     ++	memset(&ctx.items[ctx.items_nr], 0, sizeof(struct todo_item));
     ++	ctx.items[ctx.items_nr].command = TODO_UPDATE_REFS;
     ++	ctx.items_nr++;
     ++
      +	free(todo_list->items);
     -+	todo_list->items = ctx.new_list.items;
     -+	todo_list->nr = ctx.new_list.nr;
     -+	todo_list->alloc = ctx.new_list.alloc;
     ++	todo_list->items = ctx.items;
     ++	todo_list->nr = ctx.items_nr;
     ++	todo_list->alloc = ctx.items_alloc;
      +
      +	return 0;
      +}
     @@ sequencer.c: static int skip_unnecessary_picks(struct repository *r,
       		    const char *shortrevisions, const char *onto_name,
       		    struct commit *onto, const struct object_id *orig_head,
       		    struct string_list *commands, unsigned autosquash,
     -+		    unsigned keep_decorations,
     ++		    unsigned update_refs,
       		    struct todo_list *todo_list)
       {
       	char shortonto[GIT_MAX_HEXSZ + 1];
      @@ sequencer.c: int complete_action(struct repository *r, struct replay_opts *opts, unsigned fla
     - 	if (autosquash && todo_list_rearrange_squash(todo_list))
     - 		return -1;
     + 		item->arg_len = item->arg_offset = item->flags = item->offset_in_buf = 0;
     + 	}
       
     -+	if (keep_decorations && todo_list_add_branch_commands(todo_list))
     ++	if (update_refs && todo_list_add_update_ref_commands(todo_list))
      +		return -1;
      +
     - 	if (commands->nr)
     - 		todo_list_add_exec_commands(todo_list, commands);
     + 	if (autosquash && todo_list_rearrange_squash(todo_list))
     + 		return -1;
       
      
       ## sequencer.h ##
     @@ sequencer.h: int complete_action(struct repository *r, struct replay_opts *opts,
       		    const char *shortrevisions, const char *onto_name,
       		    struct commit *onto, const struct object_id *orig_head,
       		    struct string_list *commands, unsigned autosquash,
     -+		    unsigned keep_decorations,
     ++		    unsigned update_refs,
       		    struct todo_list *todo_list);
       int todo_list_rearrange_squash(struct todo_list *todo_list);
       
     @@ t/t3404-rebase-interactive.sh: test_expect_success 'ORIG_HEAD is updated correct
       	test_cmp_rev ORIG_HEAD test-orig-head@{1}
       '
       
     -+test_expect_success '--update-refs adds git branch commands' '
     ++test_expect_success '--update-refs adds label and update-ref commands' '
      +	git checkout -b update-refs no-conflict-branch &&
     -+	test_commit extra fileX &&
     -+	git commit --amend --fixup=L &&
     ++	git branch -f base HEAD~4 &&
     ++	git branch -f first HEAD~3 &&
     ++	git branch -f second HEAD~3 &&
     ++	git branch -f third HEAD~1 &&
     ++	git commit --allow-empty --fixup=third &&
     ++	git branch -f shared-tip &&
      +	(
      +		set_cat_todo_editor &&
     -+		git branch -f base A &&
     -+		git branch -f first J &&
     -+		git branch -f second J &&
     -+		git branch -f third L &&
     -+
     -+		test_must_fail git rebase -i --autosquash --update-refs primary >todo &&
      +
      +		cat >expect <<-EOF &&
      +		pick $(git log -1 --format=%h J) J
     -+		exec git update-ref refs/heads/second HEAD $(git rev-parse J)
     -+		exec git update-ref refs/heads/first HEAD $(git rev-parse  J)
     ++		label for-update-refs/refs/heads/second
     ++		label for-update-refs/refs/heads/first
      +		pick $(git log -1 --format=%h K) K
      +		pick $(git log -1 --format=%h L) L
     -+		fixup $(git log -1 --format=%h update-refs) fixup! L
     -+		exec git update-ref refs/heads/third HEAD $(git rev-parse L)
     ++		fixup $(git log -1 --format=%h update-refs) fixup! L # empty
     ++		label for-update-refs/refs/heads/third
      +		pick $(git log -1 --format=%h M) M
     -+		exec git update-ref refs/heads/no-conflict-branch HEAD $(git rev-parse M)
     ++		label for-update-refs/refs/heads/no-conflict-branch
     ++		label for-update-refs/refs/heads/shared-tip
     ++		update-refs
      +		EOF
      +
     ++		test_must_fail git rebase -i --autosquash --update-refs primary >todo &&
     ++		test_cmp expect todo
     ++	)
     ++'
     ++
     ++test_expect_success '--update-refs adds commands with --rebase-merges' '
     ++	git checkout -b update-refs-with-merge no-conflict-branch &&
     ++	git branch -f base HEAD~4 &&
     ++	git branch -f first HEAD~3 &&
     ++	git branch -f second HEAD~3 &&
     ++	git branch -f third HEAD~1 &&
     ++	git merge -m merge branch2 &&
     ++	git branch -f merge-branch &&
     ++	git commit --fixup=third --allow-empty &&
     ++	(
     ++		set_cat_todo_editor &&
     ++
     ++		cat >expect <<-EOF &&
     ++		label onto
     ++		reset onto
     ++		pick $(git log -1 --format=%h branch2~1) F
     ++		pick $(git log -1 --format=%h branch2) I
     ++		label for-update-refs/refs/heads/branch2
     ++		label merge
     ++		reset onto
     ++		pick $(git log -1 --format=%h refs/heads/second) J
     ++		label for-update-refs/refs/heads/second
     ++		label for-update-refs/refs/heads/first
     ++		pick $(git log -1 --format=%h refs/heads/third~1) K
     ++		pick $(git log -1 --format=%h refs/heads/third) L
     ++		fixup $(git log -1 --format=%h update-refs-with-merge) fixup! L # empty
     ++		label for-update-refs/refs/heads/third
     ++		pick $(git log -1 --format=%h HEAD~2) M
     ++		label for-update-refs/refs/heads/no-conflict-branch
     ++		merge -C $(git log -1 --format=%h HEAD~1) merge # merge
     ++		label for-update-refs/refs/heads/merge-branch
     ++		update-refs
     ++		EOF
     ++
     ++		test_must_fail git rebase -i --autosquash \
     ++				   --rebase-merges=rebase-cousins \
     ++				   --update-refs primary >todo &&
     ++
      +		test_cmp expect todo
      +	)
      +'
 -:  ----------- > 6:  68f8e51b19c sequencer: implement 'update-refs' command
 4:  b99c5bf34ef ! 7:  3d7d3f656b4 rebase: add rebase.updateRefs config option
     @@ builtin/rebase.c: static int rebase_config(const char *var, const char *value, v
       		return 0;
      
       ## t/t3404-rebase-interactive.sh ##
     -@@ t/t3404-rebase-interactive.sh: test_expect_success '--update-refs adds git branch commands' '
     - 		exec git update-ref refs/heads/no-conflict-branch HEAD $(git rev-parse M)
     +@@ t/t3404-rebase-interactive.sh: test_expect_success '--update-refs adds label and update-ref commands' '
       		EOF
       
     + 		test_must_fail git rebase -i --autosquash --update-refs primary >todo &&
      +		test_cmp expect todo &&
      +
      +		test_must_fail git -c rebase.autosquash=true \
      +				   -c rebase.updaterefs=true \
      +				   rebase -i primary >todo &&
     ++
     + 		test_cmp expect todo
     + 	)
     + '
     +@@ t/t3404-rebase-interactive.sh: test_expect_success '--update-refs adds commands with --rebase-merges' '
     + 				   --rebase-merges=rebase-cousins \
     + 				   --update-refs primary >todo &&
     + 
     ++		test_cmp expect todo &&
     ++
     ++		test_must_fail git -c rebase.autosquash=true \
     ++				   -c rebase.updaterefs=true \
     ++				   rebase -i \
     ++				   --rebase-merges=rebase-cousins \
     ++				   primary >todo &&
     ++
       		test_cmp expect todo
       	)
       '

Junio C Hamano June 7, 2022, 10:11 p.m. UTC | #1

"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Derrick Stolee <derrickstolee@github.com>
>
> The todo_command_info array defines which strings match with which
> todo_command enum values. The array is defined in the same order as the
> enum values, but if one changed without the other, then we would have
> unexpected results.
>
> Make it easier to see changes to the enum and this array by using the
> enum values as the indices of the array.
>
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
>  sequencer.c | 28 ++++++++++++++--------------
>  1 file changed, 14 insertions(+), 14 deletions(-)

Good thinking.

I love seeing a future-proofing change like this.

Thanks.

> diff --git a/sequencer.c b/sequencer.c
> index 8c3ed3532ac..8e26c9a6261 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -1685,20 +1685,20 @@ static struct {
>  	char c;
>  	const char *str;
>  } todo_command_info[] = {
> -	{ 'p', "pick" },
> -	{ 0,   "revert" },
> -	{ 'e', "edit" },
> -	{ 'r', "reword" },
> -	{ 'f', "fixup" },
> -	{ 's', "squash" },
> -	{ 'x', "exec" },
> -	{ 'b', "break" },
> -	{ 'l', "label" },
> -	{ 't', "reset" },
> -	{ 'm', "merge" },
> -	{ 0,   "noop" },
> -	{ 'd', "drop" },
> -	{ 0,   NULL }
> +	[TODO_PICK] = { 'p', "pick" },
> +	[TODO_REVERT] = { 0,   "revert" },
> +	[TODO_EDIT] = { 'e', "edit" },
> +	[TODO_REWORD] = { 'r', "reword" },
> +	[TODO_FIXUP] = { 'f', "fixup" },
> +	[TODO_SQUASH] = { 's', "squash" },
> +	[TODO_EXEC] = { 'x', "exec" },
> +	[TODO_BREAK] = { 'b', "break" },
> +	[TODO_LABEL] = { 'l', "label" },
> +	[TODO_RESET] = { 't', "reset" },
> +	[TODO_MERGE] = { 'm', "merge" },
> +	[TODO_NOOP] = { 0,   "noop" },
> +	[TODO_DROP] = { 'd', "drop" },
> +	[TODO_COMMENT] = { 0,   NULL },
>  };
>  
>  static const char *command_to_string(const enum todo_command command)

Junio C Hamano June 7, 2022, 10:23 p.m. UTC | #2

"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Derrick Stolee <derrickstolee@github.com>
>
> The previous change allowed 'git rebase --update-refs' to create 'label'
> commands for each branch  among the commits being rewritten and add an
> 'update-refs' command at the end of the todo list. Now, teach Git to
> update the refs during that final 'update-refs' command.
>
> We need to create an array of new and old OIDs for each ref by iterating
> over the refs/rewritten/for-update-refs/ namespace. We cannot update the
> refs in-place since this will confuse the refs iterator.

In other words, grab everything we need to do and then use that
in-core information to tell refs API what refs to change to what
values?  That dounds like a quite reasonable thing to do.

Looking at the patch text, the only thing that stands out at me is
that this does not seem to perform the updates in a single
transaction, which may often not matter in practice, but may be a
prudent thing to do anyway at philosophical level.

Thanks, will queue.

Phillip Wood June 8, 2022, 2:32 p.m. UTC | #3

Hi Stolee

On 07/06/2022 21:42, Derrick Stolee via GitGitGadget wrote:
> [...]
> As an example, here is my in-progress bundle URI RFC split into subtopics as
> they appear during the TODO list of a git rebase -i --update-refs:
> 
> pick 2d966282ff3 docs: document bundle URI standard
> pick 31396e9171a remote-curl: add 'get' capability
> pick 54c6ab70f67 bundle-uri: create basic file-copy logic
> pick 96cb2e35af1 bundle-uri: add support for http(s):// and file://
> pick 6adaf842684 fetch: add --bundle-uri option
> pick 6c5840ed77e fetch: add 'refs/bundle/' to log.excludeDecoration
> label for-update-refs/refs/heads/bundle-redo/fetch
> [...] 
> update-refs
> [...]
> Based on some of the discussion, it seemed like one way to do this would be
> to have an 'update-ref ' command that would take the place of these 'label'
> commands. However, this would require two things that make it a bit awkward:
> 
>   1. We would need to replicate the storage of those positions during the
>      rebase. 'label' already does this pretty well. I've added the
>      "for-update-refs/" label to help here.

I'm afraid I don't think that using a label with a name constructed from 
a magic prefix and the full refname is a good user interface.

(i) Having labels behave differently based on their name is confusing.

(ii) The length of the label string is cumbersome for users. Rebase 
already has a reputation for being unfriendly and hard to use, this will 
not improve that. "update-ref bundle-redo/fetch" is much simpler.

(iii) It means that we no longer store the oid of each branch when 
creating the todo list and so cannot check if it is safe to update it at 
the end of the rebase (just using the current value of the branch ref at 
the end of a long running operation like rebase is not sufficient to be 
safe).

>   2. If we want to close out all of the refs as the rebase is finishing, then
>      that "step" becomes invisible to the user (and a bit more complicated to
>      insert). Thus, the 'update-refs' step performs this action. If the user
>      wants to do things after that step, then they can do so by editing the
>      TODO list.

I'm not sure what the use case is for making the update-refs step 
visible to the user - it seems to be added complexity for them to deal 
with. We don't do that for the branch that is being rebased so why do we 
need to do it for the other branches? As for the implementation we could 
just update the branches at the end of the loop in pick_commits() where 
we update the branch and run the post-rewrite hook etc. there is no need 
for an entry in the todo list.

Best Wishes

Phillip

Derrick Stolee June 8, 2022, 6:09 p.m. UTC | #4

On 6/8/2022 10:32 AM, Phillip Wood wrote:
> Hi Stolee
> 
> On 07/06/2022 21:42, Derrick Stolee via GitGitGadget wrote:
>> [...]
>> As an example, here is my in-progress bundle URI RFC split into subtopics as
>> they appear during the TODO list of a git rebase -i --update-refs:
>>
>> pick 2d966282ff3 docs: document bundle URI standard
>> pick 31396e9171a remote-curl: add 'get' capability
>> pick 54c6ab70f67 bundle-uri: create basic file-copy logic
>> pick 96cb2e35af1 bundle-uri: add support for http(s):// and file://
>> pick 6adaf842684 fetch: add --bundle-uri option
>> pick 6c5840ed77e fetch: add 'refs/bundle/' to log.excludeDecoration
>> label for-update-refs/refs/heads/bundle-redo/fetch
>> [...] update-refs
>> [...]
>> Based on some of the discussion, it seemed like one way to do this would be
>> to have an 'update-ref ' command that would take the place of these 'label'
>> commands. However, this would require two things that make it a bit awkward:
>>
>>   1. We would need to replicate the storage of those positions during the
>>      rebase. 'label' already does this pretty well. I've added the
>>      "for-update-refs/" label to help here.
> 
> I'm afraid I don't think that using a label with a name constructed from
> a magic prefix and the full refname is a good user interface.
> 
> (i) Having labels behave differently based on their name is confusing.

The label commands do as advertised, but the 'update-refs' command does
something with the set of refs based on their names, yes. We would need to
store this information during the rebase _somewhere_, and refs/rewritten/
seems natural.

> (ii) The length of the label string is cumbersome for users. Rebase already
> has a reputation for being unfriendly and hard to use, this will not improve
> that. "update-ref bundle-redo/fetch" is much simpler.

I agree that your proposed replacement for "label for-update-refs/..." would
look cleaner. I don't think that that is enough to justify hiding information
from the user.

> (iii) It means that we no longer store the oid of each branch when creating> the todo list and so cannot check if it is safe to update it at the end of the
> rebase (just using the current value of the branch ref at the end of a long
> running operation like rebase is not sufficient to be safe).

The operation we are doing necessarily requires a force-update, since we are
modifying history. The safety you are caring about here is about the case where
the user modifies one of these refs between the initial 'git rebase --update-refs'
and the final 'git rebase --continue' that actually writes the refs. You want to
prevent that final update from succeeding because another change to those refs
could be lost.

I'm less concerned about this case, because the user is requesting that we
update the refs pointing to this set of commits. But, I'm glad you brought it
up.

One way to prevent this situation would be to extend the idea of "what branch
is being rebased?" (REBASE_HEAD) to "which branches are being rewritten?"
(REBASE_HEADS?). If the worktree kept the full list somewhere in the $GIT_DIR,
then other Git commands could notice that those refs are currently being
overwritten and those updates should fail (similarly to how we currently fail
to update a branch being rebased). This ties into your later point:

>>   2. If we want to close out all of the refs as the rebase is finishing, then
>>      that "step" becomes invisible to the user (and a bit more complicated to
>>      insert). Thus, the 'update-refs' step performs this action. If the user
>>      wants to do things after that step, then they can do so by editing the
>>      TODO list.
> 
> I'm not sure what the use case is for making the update-refs step visible to
> the user - it seems to be added complexity for them to deal with. We don't do
> that for the branch that is being rebased so why do we need to do it for the
> other branches? As for the implementation we could just update the branches
> at the end of the loop in pick_commits() where we update the branch and run the
> post-rewrite hook etc. there is no need for an entry in the todo list.

I concede that allowing the user to move the 'update-refs' command around is
not super important.

The biggest concern I originally had with this idea was that there was no
connection between the "--update-refs" option in the first "git rebase"
command and the final "git rebase --continue" command. That is, other than
which refs are in refs/rewritten/*.

HOWEVER, using refs/rewritten/* is likely buggy: if we had two rebases
happening at the same time (operating on a disjoint set of refs), then how
do we differentiate which refs make sense for us to update as we complete
this rebase?

So, I'm coming to the conclusion that using refs/rewritten/* is problematic
and I should use something closer to REBASE_HEAD as a way to describe which
refs are being updated. In that context, your 'update-ref <ref>' command
makes a lot more sense. The user can decide to move those around _or_ remove
them; it won't change their protection under "REBASE_HEADS" but would
prevent them from being rewritten.

While I think on this, I'll send my branch_checked_out() patches in a
separate thread, since those have grown in complexity since this version.

Thanks,
-Stolee

Phillip Wood June 9, 2022, 10:04 a.m. UTC | #5

Hi Stolee

On 08/06/2022 19:09, Derrick Stolee wrote:
> On 6/8/2022 10:32 AM, Phillip Wood wrote:
>> Hi Stolee
>>
>> On 07/06/2022 21:42, Derrick Stolee via GitGitGadget wrote:
>>> [...]
>>> As an example, here is my in-progress bundle URI RFC split into subtopics as
>>> they appear during the TODO list of a git rebase -i --update-refs:
>>>
>>> pick 2d966282ff3 docs: document bundle URI standard
>>> pick 31396e9171a remote-curl: add 'get' capability
>>> pick 54c6ab70f67 bundle-uri: create basic file-copy logic
>>> pick 96cb2e35af1 bundle-uri: add support for http(s):// and file://
>>> pick 6adaf842684 fetch: add --bundle-uri option
>>> pick 6c5840ed77e fetch: add 'refs/bundle/' to log.excludeDecoration
>>> label for-update-refs/refs/heads/bundle-redo/fetch
>>> [...] update-refs
>>> [...]
>>> Based on some of the discussion, it seemed like one way to do this would be
>>> to have an 'update-ref ' command that would take the place of these 'label'
>>> commands. However, this would require two things that make it a bit awkward:
>>>
>>>    1. We would need to replicate the storage of those positions during the
>>>       rebase. 'label' already does this pretty well. I've added the
>>>       "for-update-refs/" label to help here.
>>
>> I'm afraid I don't think that using a label with a name constructed from
>> a magic prefix and the full refname is a good user interface.
>>
>> (i) Having labels behave differently based on their name is confusing.
> 
> The label commands do as advertised, but the 'update-refs' command does
> something with the set of refs based on their names, yes. We would need to
> store this information during the rebase _somewhere_, and refs/rewritten/
> seems natural.
>   
>> (ii) The length of the label string is cumbersome for users. Rebase already
>> has a reputation for being unfriendly and hard to use, this will not improve
>> that. "update-ref bundle-redo/fetch" is much simpler.
> 
> I agree that your proposed replacement for "label for-update-refs/..." would
> look cleaner. I don't think that that is enough to justify hiding information
> from the user.

In general I think it is better not to burden the user with unnecessary 
information - it makes the ui more complicated and exposes 
implementation details which makes it harder to change the implementation.

>> (iii) It means that we no longer store the oid of each branch when creating> the todo list and so cannot check if it is safe to update it at the end of the
>> rebase (just using the current value of the branch ref at the end of a long
>> running operation like rebase is not sufficient to be safe).
> 
> The operation we are doing necessarily requires a force-update, since we are
> modifying history. The safety you are caring about here is about the case where
> the user modifies one of these refs between the initial 'git rebase --update-refs'
> and the final 'git rebase --continue' that actually writes the refs. You want to
> prevent that final update from succeeding because another change to those refs
> could be lost.
> 
> I'm less concerned about this case, because the user is requesting that we
> update the refs pointing to this set of commits. But, I'm glad you brought it
> up.

At the end of the rebase we pass the oid of the branch that we store at 
the start of the rebase when updating the ref to avoid nasty surprises. 
I think it makes sense to do the same for the other branches being 
rewritten - it is easy to get stuck on a conflict resolution and go and 
do something else for a while and forget a branch is being rebased. If 
we cannot update the ref at the end of the rebase we should print the 
new oid with instructions to run "git reset --hard" or "git update-ref" 
if the user wants to update the branch.

> One way to prevent this situation would be to extend the idea of "what branch
> is being rebased?" (REBASE_HEAD) to "which branches are being rewritten?"
> (REBASE_HEADS?). 

Just to clarify REBASE_HEAD points to the commit we're currently trying 
to pick, the branch ref is stored it .git/rebase-merge/head-name and its 
oid in .git/rebase-merge/orig-head

> If the worktree kept the full list somewhere in the $GIT_DIR,
> then other Git commands could notice that those refs are currently being
> overwritten and those updates should fail (similarly to how we currently fail
> to update a branch being rebased).

That's a good point and one that I should have remembered as I 
implemented that check in my script. I agree that we should stop the 
user starting a rebase in a worktree whose branch is being updated 
elsewhere. If we write a list of the refs we want to update under 
.git/rebase-merge before walking the other worktrees to see what's being 
rebased elsewhere that avoids a race where two processes start that try 
to rebase the same branch in different worktrees at the same time. It 
would be easy to store the original oids with the ref names.

An added complexity is that we would have to check the todo list for any 
new update-ref commands and check those refs are not being rebased 
elsewhere each time the list is edited. In the future we should update 
"git status" to read that file but I don't think that needs to be part 
of the initial implementation.

> This ties into your later point:
> 
>>>    2. If we want to close out all of the refs as the rebase is finishing, then
>>>       that "step" becomes invisible to the user (and a bit more complicated to
>>>       insert). Thus, the 'update-refs' step performs this action. If the user
>>>       wants to do things after that step, then they can do so by editing the
>>>       TODO list.
>>
>> I'm not sure what the use case is for making the update-refs step visible to
>> the user - it seems to be added complexity for them to deal with. We don't do
>> that for the branch that is being rebased so why do we need to do it for the
>> other branches? As for the implementation we could just update the branches
>> at the end of the loop in pick_commits() where we update the branch and run the
>> post-rewrite hook etc. there is no need for an entry in the todo list.
> 
> I concede that allowing the user to move the 'update-refs' command around is
> not super important.
> 
> The biggest concern I originally had with this idea was that there was no
> connection between the "--update-refs" option in the first "git rebase"
> command and the final "git rebase --continue" command. That is, other than
> which refs are in refs/rewritten/*.
> 
> HOWEVER, using refs/rewritten/* is likely buggy: if we had two rebases
> happening at the same time (operating on a disjoint set of refs), then how
> do we differentiate which refs make sense for us to update as we complete
> this rebase?
 >> So, I'm coming to the conclusion that using refs/rewritten/* is 
problematic
> and I should use something closer to REBASE_HEAD as a way to describe which
> refs are being updated. In that context, your 'update-ref <ref>' command
> makes a lot more sense. The user can decide to move those around _or_ remove
> them; it won't change their protection under "REBASE_HEADS" but would
> prevent them from being rewritten.

Whenever one adds a new option to rebase it almost always involves 
writing more state to .git/rebase-merge, in this case I think we want to 
store the branches from the update-ref commands and their original oids 
there. I'm not sure that we want to expose the file to the user though 
so we can change the implementation in the future if we need to (e.g. we 
may decide it is better to store this information for all worktrees 
under a single file in $GIT_COMMON_DIR).

> While I think on this, I'll send my branch_checked_out() patches in a
> separate thread, since those have grown in complexity since this version.

If you want to discuss any ideas face-to-face drop me a email and we can 
arrange a time to chat if that would be useful.

Best Wishes

Phillip

> -Stolee

[v2,0/7] rebase: update branches in multi-part topic

Message

Comments