[0/4] rebase: update branches in multi-part topic

Message ID	pull.1247.git.1654263472.gitgitgadget@gmail.com (mailing list archive)
Headers	show Return-Path: <git-owner@kernel.org> Message-Id: <pull.1247.git.1654263472.gitgitgadget@gmail.com> From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> Date: Fri, 03 Jun 2022 13:37:48 +0000 Subject: [PATCH 0/4] rebase: update branches in multi-part topic Fcc: Sent Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit MIME-Version: 1.0 To: git@vger.kernel.org Cc: gitster@pobox.com, johannes.schindelin@gmx.de, me@ttaylorr.com, Derrick Stolee <derrickstolee@github.com> Precedence: bulk
Series	rebase: update branches in multi-part topic \| expand [0/4] rebase: update branches in multi-part topic [1/4] log-tree: create for_each_decoration() [2/4] branch: add branch_checked_out() helper [3/4] rebase: add --update-refs option [4/4] rebase: add rebase.updateRefs config option

Kazuhiro Kato via GitGitGadget June 3, 2022, 1:37 p.m. UTC

This is a feature I've wanted for quite a while. When working on the sparse
index topic, I created a long RFC that actually broke into three topics for
full review upstream. These topics were sequential, so any feedback on an
earlier one required updates to the later ones. I would work on the full
feature and use interactive rebase to update the full list of commits.
However, I would need to update the branches pointing to those sub-topics.

This series adds a new --update-refs option to 'git rebase' (along with a
rebase.updateRefs config option) that adds 'git update-ref' commands into
the TODO list. This is powered by the commit decoration machinery.

As an example, here is my in-progress bundle URI RFC split into subtopics as
they appear during the TODO list of a git rebase -i --update-refs:

pick 2d966282ff3 docs: document bundle URI standard
pick 31396e9171a remote-curl: add 'get' capability
pick 54c6ab70f67 bundle-uri: create basic file-copy logic
pick 96cb2e35af1 bundle-uri: add support for http(s):// and file://
pick 6adaf842684 fetch: add --bundle-uri option
pick 6c5840ed77e fetch: add 'refs/bundle/' to log.excludeDecoration
exec git update-ref refs/heads/bundle-redo/fetch HEAD 6c5840ed77e1bc41c1fe6fb7c894ceede1b8d730

pick 1e3f6546632 clone: add --bundle-uri option
pick 9e4a6fe9b68 clone: --bundle-uri cannot be combined with --depth
exec git update-ref refs/heads/bundle-redo/clone HEAD 9e4a6fe9b68a8455b427c9ac8cdbff30c96653b4

pick 5451cb6599c bundle-uri: create bundle_list struct and helpers
pick 3029c3aca15 bundle-uri: create base key-value pair parsing
pick a8b2de79ce8 bundle-uri: create "key=value" line parsing
pick 92625a47673 bundle-uri: unit test "key=value" parsing
pick a8616af4dc2 bundle-uri: limit recursion depth for bundle lists
pick 9d6809a8d53 bundle-uri: parse bundle list in config format
pick 287a732b54c bundle-uri: fetch a list of bundles
exec git update-ref refs/heads/bundle-redo/list HEAD 287a732b54c4d95e7f410b3b36ef90d8a19cd346

pick b09f8226185 protocol v2: add server-side "bundle-uri" skeleton
pick 520204dcd1c bundle-uri client: add minimal NOOP client
pick 62e8b457b48 bundle-uri client: add "git ls-remote-bundle-uri"
pick 00eae925043 bundle-uri: serve URI advertisement from bundle.* config
pick 4277440a250 bundle-uri client: add boolean transfer.bundleURI setting
pick caf4599a81d bundle-uri: allow relative URLs in bundle lists
pick df255000b7e bundle-uri: download bundles from an advertised list
pick d71beabf199 clone: unbundle the advertised bundles
pick c9578391976 t5601: basic bundle URI tests
# Ref refs/heads/bundle-redo/rfc-3 checked out at '/home/stolee/_git/git-bundles'

exec git update-ref refs/heads/bundle-redo/advertise HEAD c9578391976ab9899c4e4f9b5fa2827650097305


The first two patches are helpers that are needed, but the full logic of the
--update-refs option is introduced in patch 3. The config option is
available in patch 4.

Thanks, -Stolee

Derrick Stolee (4):
  log-tree: create for_each_decoration()
  branch: add branch_checked_out() helper
  rebase: add --update-refs option
  rebase: add rebase.updateRefs config option

 Documentation/config/rebase.txt |   3 +
 Documentation/git-rebase.txt    |  11 ++++
 branch.c                        |  24 ++++---
 branch.h                        |   8 +++
 builtin/rebase.c                |  10 +++
 log-tree.c                      | 111 ++++++++++++++++++++++----------
 log-tree.h                      |   4 ++
 sequencer.c                     |  99 ++++++++++++++++++++++++++++
 sequencer.h                     |   1 +
 t/t3404-rebase-interactive.sh   |  34 ++++++++++
 10 files changed, 262 insertions(+), 43 deletions(-)


base-commit: 2668e3608e47494f2f10ef2b6e69f08a84816bcb
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1247%2Fderrickstolee%2Frebase-keep-decorations-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1247/derrickstolee/rebase-keep-decorations-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1247

Junio C Hamano June 3, 2022, 4:56 p.m. UTC | #1

"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> This is a feature I've wanted for quite a while. When working on the sparse
> index topic, I created a long RFC that actually broke into three topics for
> full review upstream. These topics were sequential, so any feedback on an
> earlier one required updates to the later ones. I would work on the full
> feature and use interactive rebase to update the full list of commits.
> However, I would need to update the branches pointing to those sub-topics.
>
> This series adds a new --update-refs option to 'git rebase' (along with a
> rebase.updateRefs config option) that adds 'git update-ref' commands into
> the TODO list. This is powered by the commit decoration machinery.

;-)

Taylor Blau June 3, 2022, 6:27 p.m. UTC | #2

On Fri, Jun 03, 2022 at 01:37:48PM +0000, Derrick Stolee via GitGitGadget wrote:
> This is a feature I've wanted for quite a while. When working on the sparse
> index topic, I created a long RFC that actually broke into three topics for
> full review upstream. These topics were sequential, so any feedback on an
> earlier one required updates to the later ones. I would work on the full
> feature and use interactive rebase to update the full list of commits.
> However, I would need to update the branches pointing to those sub-topics.

This is really exciting. I'm glad that you're working on it, because I
have also wanted this feature a handful of times over the years.

This definitely would have come in handy when writing MIDX bitmaps,
where I was often editing a local branch pointing at the final topic,
and then trying to reconstruct all of the intermediate branches after
each rebase. Not ever having to do that again sounds like a delight ;-).

> pick 2d966282ff3 docs: document bundle URI standard
> pick 31396e9171a remote-curl: add 'get' capability
> pick 54c6ab70f67 bundle-uri: create basic file-copy logic
> pick 96cb2e35af1 bundle-uri: add support for http(s):// and file://
> pick 6adaf842684 fetch: add --bundle-uri option
> pick 6c5840ed77e fetch: add 'refs/bundle/' to log.excludeDecoration
> exec git update-ref refs/heads/bundle-redo/fetch HEAD 6c5840ed77e1bc41c1fe6fb7c894ceede1b8d730

But I wonder if we can or should delay these update-refs as long as
possible. In particular: what happens if I get past this "exec" line (so
that I've already updated my bundle-redo/fetch branch to point at the
new thing), but decide at some later point to abort the rebase?

I think users will expect us to restore bundle-redo/fetch to where it
was before if we end up in that case. Recovering from it manually sounds
like kind of a headache.

What if instead we created labels here, and then delayed all ref updates
to the end by replacing this with:

    label bundle-redo/fetch

and then at the end of the todo list we'd add:

    exec git update-ref refs/heads/bundle-redo/fetch refs/rewritten/bundle-redo/fetch

If we do all of those ref updates in a single transaction at the end, it
should be much easier to roll back from if desired, and we'd avoid the
aborted-rebase problem entirely.

Thanks,
Taylor

Junio C Hamano June 3, 2022, 6:52 p.m. UTC | #3

Taylor Blau <me@ttaylorr.com> writes:

> But I wonder if we can or should delay these update-refs as long as
> possible. In particular: what happens if I get past this "exec" line (so
> that I've already updated my bundle-redo/fetch branch to point at the
> new thing), but decide at some later point to abort the rebase?
>
> I think users will expect us to restore bundle-redo/fetch to where it
> was before if we end up in that case. Recovering from it manually sounds
> like kind of a headache.

That is a very good safety and usability concern.  I am glad
somebody thought of it.

> What if instead we created labels here, and then delayed all ref updates
> to the end by replacing this with:
>
>     label bundle-redo/fetch
>
> and then at the end of the todo list we'd add:
>
>     exec git update-ref refs/heads/bundle-redo/fetch refs/rewritten/bundle-redo/fetch
>
> If we do all of those ref updates in a single transaction at the end, it
> should be much easier to roll back from if desired, and we'd avoid the
> aborted-rebase problem entirely.

;-)

Jeff Hostetler June 3, 2022, 7:59 p.m. UTC | #4

On 6/3/22 2:27 PM, Taylor Blau wrote:
> On Fri, Jun 03, 2022 at 01:37:48PM +0000, Derrick Stolee via GitGitGadget wrote:
>> This is a feature I've wanted for quite a while. When working on the sparse
>> index topic, I created a long RFC that actually broke into three topics for
>> full review upstream. These topics were sequential, so any feedback on an
>> earlier one required updates to the later ones. I would work on the full
>> feature and use interactive rebase to update the full list of commits.
>> However, I would need to update the branches pointing to those sub-topics.
> 
> This is really exciting. I'm glad that you're working on it, because I
> have also wanted this feature a handful of times over the years.
> 
> This definitely would have come in handy when writing MIDX bitmaps,
> where I was often editing a local branch pointing at the final topic,
> and then trying to reconstruct all of the intermediate branches after
> each rebase. Not ever having to do that again sounds like a delight ;-).
> 
>> pick 2d966282ff3 docs: document bundle URI standard
>> pick 31396e9171a remote-curl: add 'get' capability
>> pick 54c6ab70f67 bundle-uri: create basic file-copy logic
>> pick 96cb2e35af1 bundle-uri: add support for http(s):// and file://
>> pick 6adaf842684 fetch: add --bundle-uri option
>> pick 6c5840ed77e fetch: add 'refs/bundle/' to log.excludeDecoration
>> exec git update-ref refs/heads/bundle-redo/fetch HEAD 6c5840ed77e1bc41c1fe6fb7c894ceede1b8d730
> 
> But I wonder if we can or should delay these update-refs as long as
> possible. In particular: what happens if I get past this "exec" line (so
> that I've already updated my bundle-redo/fetch branch to point at the
> new thing), but decide at some later point to abort the rebase?
> 
> I think users will expect us to restore bundle-redo/fetch to where it
> was before if we end up in that case. Recovering from it manually sounds
> like kind of a headache.
> 
> What if instead we created labels here, and then delayed all ref updates
> to the end by replacing this with:
> 
>      label bundle-redo/fetch
> 
> and then at the end of the todo list we'd add:
> 
>      exec git update-ref refs/heads/bundle-redo/fetch refs/rewritten/bundle-redo/fetch
> 
> If we do all of those ref updates in a single transaction at the end, it
> should be much easier to roll back from if desired, and we'd avoid the
> aborted-rebase problem entirely.
> 
> Thanks,
> Taylor
> 

I agree. I could have really used this while juggling all of the
parts of FSMonitor recently.  And yes, it should write the updates
at the bottom in case of an abort.

Should this take a branch pattern/regex to limit the set of branches
that are updated (or offered to be updated)?  For example, if I have
an intermediate commit in the series that has 2 branch names pointing
at it, do we want to offer to update both of them or only the one
that matches some pattern related to the tip?  Or is it sufficient to
just enumerate them at the bottom of the todo list and let the user
delete the lines they don't want?

Should we actually do the update-ref's or should we write a script
that lets the user do it later?  The latter would let us also write
out the commands to force update the remote refs if that would be
helpful.

Thanks,
Jeff

Taylor Blau June 3, 2022, 8:03 p.m. UTC | #5

On Fri, Jun 03, 2022 at 03:59:00PM -0400, Jeff Hostetler wrote:
> Should this take a branch pattern/regex to limit the set of branches
> that are updated (or offered to be updated)?  For example, if I have
> an intermediate commit in the series that has 2 branch names pointing
> at it, do we want to offer to update both of them or only the one
> that matches some pattern related to the tip?  Or is it sufficient to
> just enumerate them at the bottom of the todo list and let the user
> delete the lines they don't want?

That could be a useful feature to layer on top. I don't think it's
required here, though, since (as you note) users can remove lines from
the update-ref invocation(s) at the bottom that they don't want to
perform.

> Should we actually do the update-ref's or should we write a script
> that lets the user do it later?  The latter would let us also write
> out the commands to force update the remote refs if that would be
> helpful.

That seems like it would be outside the bounds of what `rebase` should
provide, but others may feel differently.

Thanks,
Taylor

Junio C Hamano June 3, 2022, 9:23 p.m. UTC | #6

Jeff Hostetler <git@jeffhostetler.com> writes:

> I agree. I could have really used this while juggling all of the
> parts of FSMonitor recently.  And yes, it should write the updates
> at the bottom in case of an abort.
>
> Should this take a branch pattern/regex to limit the set of branches
> that are updated (or offered to be updated)?  For example, if I have
> an intermediate commit in the series that has 2 branch names pointing
> at it, do we want to offer to update both of them or only the one
> that matches some pattern related to the tip?  Or is it sufficient to
> just enumerate them at the bottom of the todo list and let the user
> delete the lines they don't want?

The latter sounds sufficient (starting from something simpler should
work well in this case).

> Should we actually do the update-ref's or should we write a script
> that lets the user do it later?  The latter would let us also write
> out the commands to force update the remote refs if that would be
> helpful.

Aren't we writing "a script" already by implementing it as an
additional "exec git update-ref" in the todo list already?  I
found that, combined with the idea to use the "label" Taylor
mentioned, was the most brilliant part of this proposal.

Phillip Wood June 4, 2022, 3:28 p.m. UTC | #7

On 03/06/2022 19:27, Taylor Blau wrote:
> On Fri, Jun 03, 2022 at 01:37:48PM +0000, Derrick Stolee via GitGitGadget wrote:
>> This is a feature I've wanted for quite a while. When working on the sparse
>> index topic, I created a long RFC that actually broke into three topics for
>> full review upstream. These topics were sequential, so any feedback on an
>> earlier one required updates to the later ones. I would work on the full
>> feature and use interactive rebase to update the full list of commits.
>> However, I would need to update the branches pointing to those sub-topics.
> 
> This is really exciting. I'm glad that you're working on it, because I
> have also wanted this feature a handful of times over the years.

Yes, thank you Stolee. I agree this will be useful, but I wonder if 
users will be confused that --update-refs only updates the branch heads 
that happen to be in the todo list, rather than updating all the 
branches that contain a rewritten commit. I think the latter is 
something we should try to add in the future and so we should take care 
to design this topic so that is possible.

> This definitely would have come in handy when writing MIDX bitmaps,
> where I was often editing a local branch pointing at the final topic,
> and then trying to reconstruct all of the intermediate branches after
> each rebase. Not ever having to do that again sounds like a delight ;-).
> 
>> pick 2d966282ff3 docs: document bundle URI standard
>> pick 31396e9171a remote-curl: add 'get' capability
>> pick 54c6ab70f67 bundle-uri: create basic file-copy logic
>> pick 96cb2e35af1 bundle-uri: add support for http(s):// and file://
>> pick 6adaf842684 fetch: add --bundle-uri option
>> pick 6c5840ed77e fetch: add 'refs/bundle/' to log.excludeDecoration
>> exec git update-ref refs/heads/bundle-redo/fetch HEAD 6c5840ed77e1bc41c1fe6fb7c894ceede1b8d730
> 
> But I wonder if we can or should delay these update-refs as long as
> possible. In particular: what happens if I get past this "exec" line (so
> that I've already updated my bundle-redo/fetch branch to point at the
> new thing), but decide at some later point to abort the rebase?

Absolutely! There is also the question of what to do if a user skips a 
commit that is a branch head. It is not obvious if they just want to 
drop that commit from the branch or if they want to skip updating the 
ref as well (I guess they can edit the todo list for the latter case).

> I think users will expect us to restore bundle-redo/fetch to where it
> was before if we end up in that case. Recovering from it manually sounds
> like kind of a headache.
> 
> What if instead we created labels here, and then delayed all ref updates
> to the end by replacing this with:
> 
>      label bundle-redo/fetch

Instead of using 'label' and 'exec' I'd prefer a new todo list command 
('update-ref' or 'update-branch'?) used in place of 'label' that takes a 
branch name and updates the branch ref at the end of the rebase. That 
would make it easy to do all the updates in a single transaction as you 
suggested. Adding exec lines to do this makes the todo list messy and we 
have been trying to stop rebase forking all the time.

Best Wishes

Phillip

> and then at the end of the todo list we'd add:
> 
>      exec git update-ref refs/heads/bundle-redo/fetch refs/rewritten/bundle-redo/fetch
> 
> If we do all of those ref updates in a single transaction at the end, it
> should be much easier to roll back from if desired, and we'd avoid the
> aborted-rebase problem entirely.

> Thanks,
> Taylor

Derrick Stolee June 6, 2022, 3:12 p.m. UTC | #8

On 6/4/2022 11:28 AM, Phillip Wood wrote:
> On 03/06/2022 19:27, Taylor Blau wrote:
>> On Fri, Jun 03, 2022 at 01:37:48PM +0000, Derrick Stolee via GitGitGadget wrote:
>>> This is a feature I've wanted for quite a while. When working on the sparse
>>> index topic, I created a long RFC that actually broke into three topics for
>>> full review upstream. These topics were sequential, so any feedback on an
>>> earlier one required updates to the later ones. I would work on the full
>>> feature and use interactive rebase to update the full list of commits.
>>> However, I would need to update the branches pointing to those sub-topics.
>>
>> This is really exciting. I'm glad that you're working on it, because I
>> have also wanted this feature a handful of times over the years.
> 
> Yes, thank you Stolee. I agree this will be useful, but I wonder if users
> will be confused that --update-refs only updates the branch heads that happen
> to be in the todo list, rather than updating all the branches that contain a
> rewritten commit. I think the latter is something we should try to add in the
> future and so we should take care to design this topic so that is possible.

At the moment, the design adds a comment to the TODO list, showing which
branches are not possible to move because they are checked out at another
worktree (or is currently checked out and will be updated by the rebase
itself). That seems like a good place to insert alternative logic in the
future if we see a need for better behavior here.

Unless: am I misunderstanding something about your concern here? Are you
worried about refs outside of refs/heads/*? Are you concerned about it
being _all_ refs/heads/* that we find?

One potential way to extend this (in the future) is to make --update-refs
take an optional string argument containing a refspec. This would replace
the default refspec of refs/heads/*.
 
>> This definitely would have come in handy when writing MIDX bitmaps,
>> where I was often editing a local branch pointing at the final topic,
>> and then trying to reconstruct all of the intermediate branches after
>> each rebase. Not ever having to do that again sounds like a delight ;-).
>>
>>> pick 2d966282ff3 docs: document bundle URI standard
>>> pick 31396e9171a remote-curl: add 'get' capability
>>> pick 54c6ab70f67 bundle-uri: create basic file-copy logic
>>> pick 96cb2e35af1 bundle-uri: add support for http(s):// and file://
>>> pick 6adaf842684 fetch: add --bundle-uri option
>>> pick 6c5840ed77e fetch: add 'refs/bundle/' to log.excludeDecoration
>>> exec git update-ref refs/heads/bundle-redo/fetch HEAD 6c5840ed77e1bc41c1fe6fb7c894ceede1b8d730
>>
>> But I wonder if we can or should delay these update-refs as long as
>> possible. In particular: what happens if I get past this "exec" line (so
>> that I've already updated my bundle-redo/fetch branch to point at the
>> new thing), but decide at some later point to abort the rebase?
> 
> Absolutely! There is also the question of what to do if a user skips a
> commit that is a branch head. It is not obvious if they just want to drop> that commit from the branch or if they want to skip updating the ref as> well (I guess they can edit the todo list for the latter case).

>> I think users will expect us to restore bundle-redo/fetch to where it
>> was before if we end up in that case. Recovering from it manually sounds
>> like kind of a headache.
>>
>> What if instead we created labels here, and then delayed all ref updates
>> to the end by replacing this with:
>>
>>      label bundle-redo/fetch
> 
> Instead of using 'label' and 'exec' I'd prefer a new todo list command
> ('update-ref' or 'update-branch'?) used in place of 'label' that takes a
> branch name and updates the branch ref at the end of the rebase. That
> would make it easy to do all the updates in a single transaction as you
> suggested. Adding exec lines to do this makes the todo list messy and we
> have been trying to stop rebase forking all the time.

Thanks. Yes, a new 'update-refs' step at the end would be good to make
it clear we want to rewrite the refs in one go without a possible
interruption from the user.

Thanks,
-Stolee

Junio C Hamano June 6, 2022, 4:36 p.m. UTC | #9

Phillip Wood <phillip.wood123@gmail.com> writes:

> Instead of using 'label' and 'exec' I'd prefer a new todo list command
> ('update-ref' or 'update-branch'?) used in place of 'label' that takes
> a branch name and updates the branch ref at the end of the
> rebase. That would make it easy to do all the updates in a single
> transaction as you suggested.

Sounds like a good approach.

Elijah Newren June 7, 2022, 6:25 a.m. UTC | #10

On Fri, Jun 3, 2022 at 11:01 AM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> This is a feature I've wanted for quite a while. When working on the sparse
> index topic, I created a long RFC that actually broke into three topics for
> full review upstream. These topics were sequential, so any feedback on an
> earlier one required updates to the later ones. I would work on the full
> feature and use interactive rebase to update the full list of commits.
> However, I would need to update the branches pointing to those sub-topics.
>
> This series adds a new --update-refs option to 'git rebase' (along with a
> rebase.updateRefs config option) that adds 'git update-ref' commands into
> the TODO list. This is powered by the commit decoration machinery.
>
> As an example, here is my in-progress bundle URI RFC split into subtopics as
> they appear during the TODO list of a git rebase -i --update-refs:
>
> pick 2d966282ff3 docs: document bundle URI standard
> pick 31396e9171a remote-curl: add 'get' capability
> pick 54c6ab70f67 bundle-uri: create basic file-copy logic
> pick 96cb2e35af1 bundle-uri: add support for http(s):// and file://
> pick 6adaf842684 fetch: add --bundle-uri option
> pick 6c5840ed77e fetch: add 'refs/bundle/' to log.excludeDecoration
> exec git update-ref refs/heads/bundle-redo/fetch HEAD 6c5840ed77e1bc41c1fe6fb7c894ceede1b8d730
>
> pick 1e3f6546632 clone: add --bundle-uri option
> pick 9e4a6fe9b68 clone: --bundle-uri cannot be combined with --depth
> exec git update-ref refs/heads/bundle-redo/clone HEAD 9e4a6fe9b68a8455b427c9ac8cdbff30c96653b4
>
> pick 5451cb6599c bundle-uri: create bundle_list struct and helpers
> pick 3029c3aca15 bundle-uri: create base key-value pair parsing
> pick a8b2de79ce8 bundle-uri: create "key=value" line parsing
> pick 92625a47673 bundle-uri: unit test "key=value" parsing
> pick a8616af4dc2 bundle-uri: limit recursion depth for bundle lists
> pick 9d6809a8d53 bundle-uri: parse bundle list in config format
> pick 287a732b54c bundle-uri: fetch a list of bundles
> exec git update-ref refs/heads/bundle-redo/list HEAD 287a732b54c4d95e7f410b3b36ef90d8a19cd346
>
> pick b09f8226185 protocol v2: add server-side "bundle-uri" skeleton
> pick 520204dcd1c bundle-uri client: add minimal NOOP client
> pick 62e8b457b48 bundle-uri client: add "git ls-remote-bundle-uri"
> pick 00eae925043 bundle-uri: serve URI advertisement from bundle.* config
> pick 4277440a250 bundle-uri client: add boolean transfer.bundleURI setting
> pick caf4599a81d bundle-uri: allow relative URLs in bundle lists
> pick df255000b7e bundle-uri: download bundles from an advertised list
> pick d71beabf199 clone: unbundle the advertised bundles
> pick c9578391976 t5601: basic bundle URI tests
> # Ref refs/heads/bundle-redo/rfc-3 checked out at '/home/stolee/_git/git-bundles'
>
> exec git update-ref refs/heads/bundle-redo/advertise HEAD c9578391976ab9899c4e4f9b5fa2827650097305
>
>
> The first two patches are helpers that are needed, but the full logic of the
> --update-refs option is introduced in patch 3. The config option is
> available in patch 4.

Very interesting; nice to see that others are thinking on a similar
wavelength.  I've got some related work in git-replay, and it heavily
uses update-refs style handling.

Not sure if there are any differences worth affecting your design or
mine, but perhaps it's worth mentioning my current work and plans in
git-replay related to the use of update-refs:

Since git-replay doesn't touch the working tree or index, I had the
natural question of what the output should be and whether ref-updates
should automatically happen.  I decided that like git-merge-tree, ref
updates should not be automatic (and thus it behaves somewhat like a
"dry run" version of rebase/cherry-pick).  Since I also want to
intrinsically handle replaying multiple branches simultaneously, the
update-ref style output seemed like a natural fit to me (meaning users
would pipe the output from git-reply to git-update-ref).  Thus, one
can do:

    $ git replay --onto main startpoint..mytopic

and see git-update-ref style output:

    update refs/heads/mytopic ${NEW_mytopic_HASH} ${OLD_mytopic_HASH}

where ${NEW_mytopic_HASH} is the commit at the tip of a chain of
rebased mytopic commits.  (Sidenote: There's the question of whether
to update 'main' (i.e. cherry-picking) or 'mytopic' (i.e. rebasing);
thus, users have to choose --advance vs. --onto to specify which of
these should happen, except in cases where it can be implicitly
determined.)

Or, if you have several branches to update:

    $ git replay --onto main ^startpoint mytopic1 mytopic2 mytopic3

and see

    update refs/heads/mytopic1 ${NEW_mytopic1_HASH} ${OLD_mytopic1_HASH}
    update refs/heads/mytopic2 ${NEW_mytopic2_HASH} ${OLD_mytopic2_HASH}
    update refs/heads/mytopic3 ${NEW_mytopic3_HASH} ${OLD_mytopic3_HASH}

(and yes, common commits from before will be replayed and then shared
by the new branch versions.)

If the topics are totally contained within each other, you can do this
more simply as

    $ git replay --contained --onto main main..my_tip_topic

and still see all the update commands:

    update refs/heads/mytopic1 ${NEW_mytopic1_HASH} ${OLD_mytopic1_HASH}
    update refs/heads/mytopic2 ${NEW_mytopic2_HASH} ${OLD_mytopic2_HASH}
    update refs/heads/mytopic3 ${NEW_mytopic3_HASH} ${OLD_mytopic3_HASH}
    update refs/heads/my_tip_topic ${NEW_my_tip_topic_HASH}
${OLD_my_tip_topic_HASH}

The option to list several refs provides a natural way of allowing
users to control what gets updated without updating all contained
branches, though rebase's implicit ranges doesn't really have a nice
analogue for you due to its implicit range handling.

Right now, git-replay doesn't handle conflicts (it just die()s).  That
and a few other thing are still TODO.

However, git-replay does already handle basic replaying of merges,
including ones with evil changes (so long as none of the merges
involved have conflicts).  So, I'm hoping git-replay will also useful
for stuff like:

   $ git replay --first-parent --interactive --onto next ^next
special_topic seen

where special_topic is one of the topics previously merged into seen.
This last commit would allow seen to be rebuilt with some changes to
special_topic.  Basically, it makes whatever interactive changes are
specified to special_topic followed by replaying all the merges in
next..seen (including any necessary conflict resolutions or other
semantic changes recorded in the merge commits being replayed), unless
those .  However, I do need to figure out some syntax for keeping
special_topic from being transplanted onto next here...

Phillip Wood June 7, 2022, 10:11 a.m. UTC | #11

On 06/06/2022 16:12, Derrick Stolee wrote:
> On 6/4/2022 11:28 AM, Phillip Wood wrote:
>> On 03/06/2022 19:27, Taylor Blau wrote:
>>> On Fri, Jun 03, 2022 at 01:37:48PM +0000, Derrick Stolee via GitGitGadget wrote:
>>>> This is a feature I've wanted for quite a while. When working on the sparse
>>>> index topic, I created a long RFC that actually broke into three topics for
>>>> full review upstream. These topics were sequential, so any feedback on an
>>>> earlier one required updates to the later ones. I would work on the full
>>>> feature and use interactive rebase to update the full list of commits.
>>>> However, I would need to update the branches pointing to those sub-topics.
>>>
>>> This is really exciting. I'm glad that you're working on it, because I
>>> have also wanted this feature a handful of times over the years.
>>
>> Yes, thank you Stolee. I agree this will be useful, but I wonder if users
>> will be confused that --update-refs only updates the branch heads that happen
>> to be in the todo list, rather than updating all the branches that contain a
>> rewritten commit. I think the latter is something we should try to add in the
>> future and so we should take care to design this topic so that is possible.
> 
> At the moment, the design adds a comment to the TODO list, showing which
> branches are not possible to move because they are checked out at another
> worktree (or is currently checked out and will be updated by the rebase
> itself). That seems like a good place to insert alternative logic in the
> future if we see a need for better behavior here.

I think the question of whether to update branches that are checked out 
in another worktree is a question of whether it is less inconvenient to 
the user to skip the ref update and leave the user to manually update 
the branch or to update the ref and leave the worktree in a potentially 
awkward state if the user was half way through building a commit. The 
answer probably depends on the preferences of the user.

I've been using a script that updates the refs for all the branches 
being rewritten for a while and have found it preferable to always 
update the ref rather than have to do it manually. My script also 
updates the worktree checkout to the new HEAD if there are no 
uncommitted changes which I have found very convenient. My preference is 
probably because I tend not to have uncommitted changes lying around in 
the worktrees whose branches get updated.

> Unless: am I misunderstanding something about your concern here? Are you
> worried about refs outside of refs/heads/*? Are you concerned about it
> being _all_ refs/heads/* that we find?

My concerns are primarily about being able to extend the --update-ref 
option in a backwards compatible way.

I'd like to be able to add functionality to rebase all refs/heads/* that 
are descended from the commits that a simple rebase would rewrite. Say I 
want to edit $commit then I want the rebase to rewrite all the commits 
and update all the branches in in

     git rev-list $(git for-each-ref --contains $commit refs/heads/) 
^$commit^@

Ideally we'd avoid adding a new commandline option when adding that. I 
think we could use an optional argument as you suggest below (though it 
would not be a refspec).

> One potential way to extend this (in the future) is to make --update-refs
> take an optional string argument containing a refspec. This would replace
> the default refspec of refs/heads/*.
> [...]  
>> Instead of using 'label' and 'exec' I'd prefer a new todo list command
>> ('update-ref' or 'update-branch'?) used in place of 'label' that takes a
>> branch name and updates the branch ref at the end of the rebase. That
>> would make it easy to do all the updates in a single transaction as you
>> suggested. Adding exec lines to do this makes the todo list messy and we
>> have been trying to stop rebase forking all the time.
> 
> Thanks. Yes, a new 'update-refs' step at the end would be good to make
> it clear we want to rewrite the refs in one go without a possible
> interruption from the user.

That's great, it is a bit more work for you but I think it gives a much 
nicer UI.

Best Wishes

Phillip

> Thanks,
> -Stolee

Derrick Stolee June 7, 2022, 7:39 p.m. UTC | #12

On 6/7/2022 6:11 AM, Phillip Wood wrote:
> On 06/06/2022 16:12, Derrick Stolee wrote:
>> On 6/4/2022 11:28 AM, Phillip Wood wrote:
>>> On 03/06/2022 19:27, Taylor Blau wrote:
>>>> On Fri, Jun 03, 2022 at 01:37:48PM +0000, Derrick Stolee via GitGitGadget wrote:
>>>>> This is a feature I've wanted for quite a while. When working on the sparse
>>>>> index topic, I created a long RFC that actually broke into three topics for
>>>>> full review upstream. These topics were sequential, so any feedback on an
>>>>> earlier one required updates to the later ones. I would work on the full
>>>>> feature and use interactive rebase to update the full list of commits.
>>>>> However, I would need to update the branches pointing to those sub-topics.
>>>>
>>>> This is really exciting. I'm glad that you're working on it, because I
>>>> have also wanted this feature a handful of times over the years.
>>>
>>> Yes, thank you Stolee. I agree this will be useful, but I wonder if users
>>> will be confused that --update-refs only updates the branch heads that happen
>>> to be in the todo list, rather than updating all the branches that contain a
>>> rewritten commit. I think the latter is something we should try to add in the
>>> future and so we should take care to design this topic so that is possible.
>>
>> At the moment, the design adds a comment to the TODO list, showing which
>> branches are not possible to move because they are checked out at another
>> worktree (or is currently checked out and will be updated by the rebase
>> itself). That seems like a good place to insert alternative logic in the
>> future if we see a need for better behavior here.
> 
> I think the question of whether to update branches that are checked out in
> another worktree is a question of whether it is less inconvenient to the user
> to skip the ref update and leave the user to manually update the branch or to> update the ref and leave the worktree in a potentially awkward state if the
> user was half way through building a commit. The answer probably depends on
> the preferences of the user.

I think that their 'git status' will look strange no matter what: their
working directory and index could look significantly different from what
the branch at HEAD is reporting. For this situation, I would rather continue
preventing these ref updates from underneath worktrees.
 
> I've been using a script that updates the refs for all the branches being
> rewritten for a while and have found it preferable to always update the ref
> rather than have to do it manually. My script also updates the worktree
> checkout to the new HEAD if there are no uncommitted changes which I have
> found very convenient. My preference is probably because I tend not to have
> uncommitted changes lying around in the worktrees whose branches get updated.

Actually updating the worktree to match seems like an interesting twist, which
we would want to consider if we go this route in the future.
 
>> Unless: am I misunderstanding something about your concern here? Are you
>> worried about refs outside of refs/heads/*? Are you concerned about it
>> being _all_ refs/heads/* that we find?
> 
> My concerns are primarily about being able to extend the --update-ref
> option in a backwards compatible way.
> 
> I'd like to be able to add functionality to rebase all refs/heads/* that
> are descended from the commits that a simple rebase would rewrite. Say I want
> to edit $commit then I want the rebase to rewrite all the commits and update
> all the branches in in
> 
>     git rev-list $(git for-each-ref --contains $commit refs/heads/) ^$commit^@
> 
> Ideally we'd avoid adding a new commandline option when adding that. I think
> we could use an optional argument as you suggest below (though it would not be
> a refspec).

This is sort of the opposite of what my series is doing: you want to find all
refs that contain the current commits and make sure that they are _also_
rebased as we rebase the current set of commits. That seems like an interesting
direction, with a very different set of complexities (ignoring other worktrees
seems like much more of a blocker here).

I would consider this to be _yet another mode_ that is separate from
--update-refs, though it could share some underlying logic. It gets more
complicated if there are merge commits involved (do we have a --rebase-merges
option?).

This makes me think of earlier discussions around a "git evolve" command
(based on Mercurial's evolve command). The idea is to have a different
workflow than I am presenting: instead of creating `fixup!` commits at the
top of a multi-part topic, you could check out a branch on a middle part
and work forward from there. Then, you could "rebase every branch with a
rewritten commit" to get things back into a nice line.

In conclusion: I don't think we should go down this rabbit hole right now.
I think that --update-refs would be something we want to enable by default
if we have such a mode, though.
 
>> One potential way to extend this (in the future) is to make --update-refs
>> take an optional string argument containing a refspec. This would replace
>> the default refspec of refs/heads/*.
>> [...] 
>>> Instead of using 'label' and 'exec' I'd prefer a new todo list command
>>> ('update-ref' or 'update-branch'?) used in place of 'label' that takes a
>>> branch name and updates the branch ref at the end of the rebase. That
>>> would make it easy to do all the updates in a single transaction as you
>>> suggested. Adding exec lines to do this makes the todo list messy and we
>>> have been trying to stop rebase forking all the time.
>>
>> Thanks. Yes, a new 'update-refs' step at the end would be good to make
>> it clear we want to rewrite the refs in one go without a possible
>> interruption from the user.
> 
> That's great, it is a bit more work for you but I think it gives a much
> nicer UI.

Thanks. After some hurdles on my end (and some additional complexities
discovered in the process) I will have v2 ready soon.

Thanks,
-Stolee

Junio C Hamano June 8, 2022, 4:03 p.m. UTC | #13

Derrick Stolee <derrickstolee@github.com> writes:

>> I think the question of whether to update branches that are checked out in
>> another worktree is a question of whether it is less inconvenient to the user
>> to skip the ref update and leave the user to manually update the branch or to
>> update the ref and leave the worktree in a potentially awkward state if the
>> user was half way through building a commit. The answer probably depends on
>> the preferences of the user.

To some degree, it might be true, but I think the recent thinking is
that by default any refs being worked on by the user must be kept
intact.  "switch" does not let you check out a branch that is
already checked out elsewhere, "fetch" does not let you overwrite
the branch that is currently checked out without "--update-head-ok",
etc.

> I think that their 'git status' will look strange no matter what: their
> working directory and index could look significantly different from what
> the branch at HEAD is reporting. For this situation, I would rather continue
> preventing these ref updates from underneath worktrees.
>  
>> I've been using a script that updates the refs for all the branches being
>> rewritten for a while and have found it preferable to always update the ref
>> rather than have to do it manually. My script also updates the worktree
>> checkout to the new HEAD if there are no uncommitted changes which I have
>> found very convenient. My preference is probably because I tend not to have
>> uncommitted changes lying around in the worktrees whose branches get updated.
>
> Actually updating the worktree to match seems like an interesting twist, which
> we would want to consider if we go this route in the future.

Usually I caution against adding "features" that can complicate the
end-user experience like this, but in this case, the potential for
extra conflicts coming from such updates to the working trees may
not be too bad.

It all depends on why the user has these intermediate branches and
how they are used, but in order to see any conflicts happen after
rebasing a branch here, before you start that rebase, you need to
have in different working trees that had these intermediate branches
checked out and had local modifications on top of them.

I would say that you deserve it if conflicts resulting from such a
set-up hurts you ;-)  If you are using these intermediate branches
so that you can work on steps in the middle of the larger whole,
then the proposed "rebase --update-refs" is not for you.  Having
local modification in these separate working trees that check out
the intermediate branches is OK, but once you create a commit on
such a branch, the commit that is in the larger topic is no longer
at the tip, so "rebase --update-refs" would not notice it anyway.

Rather, if you have the intermediate branches, you'd probably want
to work on each of them to make them stable, and rebase them on top
of each other in an appropriate order.  "rebase --update-refs" that
runs on the larger topic does not have enough information to rebuild
it with tips of intermediate branches that are updated elsewhere.

Which means that anybody sane who uses "rebase --update-refs" would
not modify these intermediate branches outside the context of the
larger topic, and those who do not follow this rule can keep both
halves of their history ;-)

FWIW a complementary tool that would work in the other direction, to
expect the user to have worked on smaller branches and rebuild the
larger branch that contains them, is also missing from our tool set.
Those who check out these smaller branches in separate working trees
would want to use such a tool, not "rebase --update-ref".

Thanks.

[0/4] rebase: update branches in multi-part topic

Message

Comments