mbox series

[RFC,0/1] rebase --onto detection of already applied commits

Message ID 20221212113516.27816-1-cristian.ciocaltea@collabora.com (mailing list archive)
Headers show
Series rebase --onto detection of already applied commits | expand

Message

Cristian Ciocaltea Dec. 12, 2022, 11:35 a.m. UTC
Let's consider the following operation:

  git rebase --onto new-base upstream feature

where 'feature' contains a few commits on top of 'upstream' which need to be
rebased onto 'new-base'.

The problem is that some of those commits have been already applied to
'new-base' and we would like to skip them as in a regular rebase command that
doesn't use '--onto', i.e. expecting to see messages like:

  warning: skipped previously applied commit [...]

However, that doesn't happen and we either get

  dropping [...] -- patch contents already upstream

or a conflict if one of the rebased commits doesn't resolve to an empty patch
anymore, e.g. due to additional changes applied on the target branch. 

This seems to be similar to the behavior of '--reapply-cherry-picks' and cannot
be disabled via '--no-reapply-cherry-picks' or by any other means.

The proposed patch is just a quick workaround, as I'm not sure what would be the
proper or recommended approach to handle the scenario described. Any advice
would be highly appreciated!

Cristian Ciocaltea (1):
  rebase --onto: Skip previously applied commits

 builtin/rebase.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Comments

Junio C Hamano Dec. 13, 2022, 12:13 a.m. UTC | #1
Cristian Ciocaltea <cristian.ciocaltea@collabora.com> writes:

> Let's consider the following operation:
>
>   git rebase --onto new-base upstream feature
>
> where 'feature' contains a few commits on top of 'upstream' which need to be
> rebased onto 'new-base'.

Isn't it what "git rebase new-base feature" for?  "My 'feature'
forked from where 'new-base' came from but they updated 'new-base'
so I do not know if some of what I had in my 'feature' is in
theirs. Please forward port what is still left in 'feature' on top
of updated 'new-base' that I just got from them".

The primary reason why we have an explicit "--onto" is so that
"rebase" is used just like

	git checkout --detach new-base
	git cherry-pick upstream..feature
	git checkout -B feature

to deal with a different situation, i.e. "My 'feature' forked from
'upstream', and I have a commit 'new-base'.  Just transplant the
whole thing on top of it", without having to worry about "what is
already in new-base?" at all.  After all, 'new-base' may not have
ANY ancestry relationship with the 'upstream', so "inspect commits
in the range upstream..new-base to exclude those that are the same
as the ones in upstream..feature" is not a well defined operation.
Cristian Ciocaltea Dec. 13, 2022, 10:37 a.m. UTC | #2
On 12/13/22 02:13, Junio C Hamano wrote:
> Cristian Ciocaltea <cristian.ciocaltea@collabora.com> writes:
> 
>> Let's consider the following operation:
>>
>>    git rebase --onto new-base upstream feature
>>
>> where 'feature' contains a few commits on top of 'upstream' which need to be
>> rebased onto 'new-base'.
> 
> Isn't it what "git rebase new-base feature" for?  "My 'feature'
> forked from where 'new-base' came from but they updated 'new-base'
> so I do not know if some of what I had in my 'feature' is in
> theirs. Please forward port what is still left in 'feature' on top
> of updated 'new-base' that I just got from them".

I should have highlighted that 'feature' contains a bunch of commits 
that must not be rebased on top of 'new-base', but only the ones in the 
'upstream..feature' range need to be considered.

> The primary reason why we have an explicit "--onto" is so that
> "rebase" is used just like
> 
> 	git checkout --detach new-base
> 	git cherry-pick upstream..feature
> 	git checkout -B feature
> 
> to deal with a different situation, i.e. "My 'feature' forked from
> 'upstream', and I have a commit 'new-base'.  Just transplant the
> whole thing on top of it", without having to worry about "what is
> already in new-base?" at all.  After all, 'new-base' may not have
> ANY ancestry relationship with the 'upstream', so "inspect commits
> in the range upstream..new-base to exclude those that are the same
> as the ones in upstream..feature" is not a well defined operation.
> 

Thanks for clarifying the intended use case of '--onto'. I have wrongly 
assumed it could be used as a more flexible rebase alternative, allowing 
one to limit the range of commits to be considered for rebasing on top 
of a given base, without losing the feature of detecting the already 
applied commits.

Currently '--onto' works as if the user provided the 
'--reapply-cherry-picks' flag, so maybe a possible improvement of this 
patch could be to handle '--no-reapply-cherry-picks' to explicitly 
enable the detection.

Would this be an acceptable extension?

Thanks,
Cristian
Junio C Hamano Dec. 13, 2022, 12:38 p.m. UTC | #3
Cristian Ciocaltea <cristian.ciocaltea@collabora.com> writes:

> Currently '--onto' works as if the user provided the
> '--reapply-cherry-picks' flag, so maybe a possible improvement of this
> patch could be to handle '--no-reapply-cherry-picks' to explicitly
> enable the detection.

That sounds like a workable approach that breaks no existing users.
Phillip Wood Dec. 13, 2022, 1:04 p.m. UTC | #4
Hi Christian

On 13/12/2022 10:37, Cristian Ciocaltea wrote:
> Currently '--onto' works as if the user provided the 
> '--reapply-cherry-picks' flag,

--onto does not affect the cherry-pick detection. When running

	git rebase --onto new-base upstream feature

any commits in upstream have been cherry-picked from feature they will 
not be rebased. What it does not do is look for cherry-picks in 
onto...feature. It would be nice to add that but I'm not sure it is 
straight forward to do so and still exclude commits that have been 
cherry-picked from feature to upstream.

Best Wishes

Phillip
Cristian Ciocaltea Dec. 13, 2022, 3:34 p.m. UTC | #5
Hi Phillip,

On 12/13/22 15:04, Phillip Wood wrote:
> Hi Christian
> 
> On 13/12/2022 10:37, Cristian Ciocaltea wrote:
>> Currently '--onto' works as if the user provided the 
>> '--reapply-cherry-picks' flag,
> 
> --onto does not affect the cherry-pick detection. When running
> 
>      git rebase --onto new-base upstream feature
> 
> any commits in upstream have been cherry-picked from feature they will 
> not be rebased. What it does not do is look for cherry-picks in 
> onto...feature. It would be nice to add that but I'm not sure it is 
> straight forward to do so and still exclude commits that have been 
> cherry-picked from feature to upstream.

The proposed patch enables looking for commits into new-base..feature 
range and excluding the ones reachable from upstream. Since this is a 
change in the existing behavior, we might need to introduce a new flag 
to enable it. I previously suggested to use '--no-reapply-cherry-picks' 
for this purpose, but now it's pretty obvious this will be a source of 
confusion, since the "cherry-picks" term refers to the commits picked 
from feature to upstream instead of new-base, as you already mentioned.

I agree it would be nice to support both exclusion ranges, but I'm not 
sure how complicated the implementation would be, since I don't have any 
previous experience with the Git internals. Could this be added as a 
separate feature at a later point?

Thanks,
Cristian

> Best Wishes
> 
> Phillip
Phillip Wood Dec. 15, 2022, 3:40 p.m. UTC | #6
Hi Cristian

On 13/12/2022 15:34, Cristian Ciocaltea wrote:
> Hi Phillip,
> 
> On 12/13/22 15:04, Phillip Wood wrote:
>> Hi Christian
>>
>> On 13/12/2022 10:37, Cristian Ciocaltea wrote:
>>> Currently '--onto' works as if the user provided the 
>>> '--reapply-cherry-picks' flag,
>>
>> --onto does not affect the cherry-pick detection. When running
>>
>>      git rebase --onto new-base upstream feature
>>
>> any commits in upstream have been cherry-picked from feature they will 
>> not be rebased. What it does not do is look for cherry-picks in 
>> onto...feature. It would be nice to add that but I'm not sure it is 
>> straight forward to do so and still exclude commits that have been 
>> cherry-picked from feature to upstream.
> 
> The proposed patch enables looking for commits into new-base..feature 
> range and excluding the ones reachable from upstream. Since this is a 
> change in the existing behavior, we might need to introduce a new flag 
> to enable it. I previously suggested to use '--no-reapply-cherry-picks' 
> for this purpose, but now it's pretty obvious this will be a source of 
> confusion, since the "cherry-picks" term refers to the commits picked 
> from feature to upstream instead of new-base, as you already mentioned.
> 
> I agree it would be nice to support both exclusion ranges, but I'm not 
> sure how complicated the implementation would be, since I don't have any 
> previous experience with the Git internals. Could this be added as a 
> separate feature at a later point?

If we can I'd rather add code that excludes cherry-pick both ranges. To 
remove the cherry-picks that are in upstream and new-base you could 
rework the todo list generation as follows

1. Calculate the merge-base $mb of feature and upstream
2. Store the list of commits $mb..feature in an array and in a hash
    table indexed their patch-id.
3. Walk $mb..upstream calculating the patch-id for each commit and
    removing any commit in the list from step 2 that matches.
4. If onto is equal to upstream skip to step 7
5. Calculate the merge-base $mb of feature and onto.
6. Walk $mb..new-base calculating the patch-id for each commit and
    removing any commit in the list from step 2 that matches.
7. Generate the todo list using the modified list of commits from step
    2.

I don't have much time at the moment but I can try and help a bit more 
in the New Year if you want.

Best Wishes

Phillip
Cristian Ciocaltea Dec. 15, 2022, 4:02 p.m. UTC | #7
Hi Phillip,

On 12/15/22 17:40, Phillip Wood wrote:
> Hi Cristian
> 
> On 13/12/2022 15:34, Cristian Ciocaltea wrote:
>> Hi Phillip,
>>
>> On 12/13/22 15:04, Phillip Wood wrote:
>>> Hi Christian
>>>
>>> On 13/12/2022 10:37, Cristian Ciocaltea wrote:
>>>> Currently '--onto' works as if the user provided the 
>>>> '--reapply-cherry-picks' flag,
>>>
>>> --onto does not affect the cherry-pick detection. When running
>>>
>>>      git rebase --onto new-base upstream feature
>>>
>>> any commits in upstream have been cherry-picked from feature they 
>>> will not be rebased. What it does not do is look for cherry-picks in 
>>> onto...feature. It would be nice to add that but I'm not sure it is 
>>> straight forward to do so and still exclude commits that have been 
>>> cherry-picked from feature to upstream.
>>
>> The proposed patch enables looking for commits into new-base..feature 
>> range and excluding the ones reachable from upstream. Since this is a 
>> change in the existing behavior, we might need to introduce a new flag 
>> to enable it. I previously suggested to use 
>> '--no-reapply-cherry-picks' for this purpose, but now it's pretty 
>> obvious this will be a source of confusion, since the "cherry-picks" 
>> term refers to the commits picked from feature to upstream instead of 
>> new-base, as you already mentioned.
>>
>> I agree it would be nice to support both exclusion ranges, but I'm not 
>> sure how complicated the implementation would be, since I don't have 
>> any previous experience with the Git internals. Could this be added as 
>> a separate feature at a later point?
> 
> If we can I'd rather add code that excludes cherry-pick both ranges. To 
> remove the cherry-picks that are in upstream and new-base you could 
> rework the todo list generation as follows
> 
> 1. Calculate the merge-base $mb of feature and upstream
> 2. Store the list of commits $mb..feature in an array and in a hash
>     table indexed their patch-id.
> 3. Walk $mb..upstream calculating the patch-id for each commit and
>     removing any commit in the list from step 2 that matches.
> 4. If onto is equal to upstream skip to step 7
> 5. Calculate the merge-base $mb of feature and onto.
> 6. Walk $mb..new-base calculating the patch-id for each commit and
>     removing any commit in the list from step 2 that matches.
> 7. Generate the todo list using the modified list of commits from step
>     2.
> 
> I don't have much time at the moment but I can try and help a bit more 
> in the New Year if you want.

Thank you for the implementation hints and your availability to help 
further! I will try to put this in practice and let you know as soon as 
I get something working.

Kind regards,
Cristian

> Best Wishes
> 
> Phillip
Phillip Wood Dec. 15, 2022, 5:07 p.m. UTC | #8
On 15/12/2022 16:02, Cristian Ciocaltea wrote:
> Hi Phillip,
> 
> On 12/15/22 17:40, Phillip Wood wrote:
>> Hi Cristian
>>
>> On 13/12/2022 15:34, Cristian Ciocaltea wrote:
>>> Hi Phillip,
>>>
>>> On 12/13/22 15:04, Phillip Wood wrote:
>>>> Hi Christian
>>>>
>>>> On 13/12/2022 10:37, Cristian Ciocaltea wrote:
>>>>> Currently '--onto' works as if the user provided the 
>>>>> '--reapply-cherry-picks' flag,
>>>>
>>>> --onto does not affect the cherry-pick detection. When running
>>>>
>>>>      git rebase --onto new-base upstream feature
>>>>
>>>> any commits in upstream have been cherry-picked from feature they 
>>>> will not be rebased. What it does not do is look for cherry-picks in 
>>>> onto...feature. It would be nice to add that but I'm not sure it is 
>>>> straight forward to do so and still exclude commits that have been 
>>>> cherry-picked from feature to upstream.
>>>
>>> The proposed patch enables looking for commits into new-base..feature 
>>> range and excluding the ones reachable from upstream. Since this is a 
>>> change in the existing behavior, we might need to introduce a new 
>>> flag to enable it. I previously suggested to use 
>>> '--no-reapply-cherry-picks' for this purpose, but now it's pretty 
>>> obvious this will be a source of confusion, since the "cherry-picks" 
>>> term refers to the commits picked from feature to upstream instead of 
>>> new-base, as you already mentioned.
>>>
>>> I agree it would be nice to support both exclusion ranges, but I'm 
>>> not sure how complicated the implementation would be, since I don't 
>>> have any previous experience with the Git internals. Could this be 
>>> added as a separate feature at a later point?
>>
>> If we can I'd rather add code that excludes cherry-pick both ranges. 
>> To remove the cherry-picks that are in upstream and new-base you could 
>> rework the todo list generation as follows
>>
>> 1. Calculate the merge-base $mb of feature and upstream
>> 2. Store the list of commits $mb..feature in an array and in a hash
>>     table indexed their patch-id.
>> 3. Walk $mb..upstream calculating the patch-id for each commit and
>>     removing any commit in the list from step 2 that matches.
>> 4. If onto is equal to upstream skip to step 7
>> 5. Calculate the merge-base $mb of feature and onto.
>> 6. Walk $mb..new-base calculating the patch-id for each commit and
>>     removing any commit in the list from step 2 that matches.
>> 7. Generate the todo list using the modified list of commits from step
>>     2.
>>
>> I don't have much time at the moment but I can try and help a bit more 
>> in the New Year if you want.
> 
> Thank you for the implementation hints and your availability to help 
> further! I will try to put this in practice and let you know as soon as 
> I get something working.

I'd start by looking at the existing todo list generation in 
sequencer.c:sequencer_make_script()

Best Wishes

Phillip