[v1,5/5] sequencer: directly call pick_commits() from complete_action()
diff mbox series

Message ID 20190925201315.19722-6-alban.gruin@gmail.com
State New
Headers show
Series
  • [v1,1/5] sequencer: update `total_nr' when adding an item to a todo list
Related show

Commit Message

Alban Gruin Sept. 25, 2019, 8:13 p.m. UTC
Currently, complete_action() calls sequencer_continue() to do the
rebase.  Even though the former already has the todo list, the latter
loads it from the disk and parses it.  Calling directly pick_commits()
from complete_action() avoids this unnecessary round trip.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 sequencer.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

Comments

Phillip Wood Sept. 27, 2019, 1:26 p.m. UTC | #1
Hi Alban

Thanks for removing some more unnecessary work reloading the the todo list.

On 25/09/2019 21:13, Alban Gruin wrote:
> Currently, complete_action() calls sequencer_continue() to do the
> rebase.  Even though the former already has the todo list, the latter
> loads it from the disk and parses it.  Calling directly pick_commits()
> from complete_action() avoids this unnecessary round trip.
> Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
> ---
>   sequencer.c | 8 +++++---
>   1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/sequencer.c b/sequencer.c
> index ec7ea8d9e5..b395dd6e11 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -5140,15 +5140,17 @@ int complete_action(struct repository *r, struct replay_opts *opts, unsigned fla
>   		return error_errno(_("could not write '%s'"), todo_file);
>   	}
>   
> -	todo_list_release(&new_todo);
> -
>   	if (checkout_onto(r, opts, onto_name, &oid, orig_head))
>   		return -1;
>   
>   	if (require_clean_work_tree(r, "rebase", "", 1, 1))
>   		return -1;
>   
> -	return sequencer_continue(r, opts);

sequencer_continue does a number of things before calling pick_commits(). It
  - calls read_and_refresh_cache() - this is unnecessary here as we've 
just called require_clean_work_tree()
  - calls read_populate_opts() - this is unnecessary as we're staring a 
new rebase so opts is fully populated
  - loads the todo list - this is unnecessary as we've just populated 
the todo list
  - commits any staged changes - this is unnecessary as we're staring a 
new rebase so there are no staged changes
  - calls record_in_rewritten() - this is unnecessary as we're starting 
a new rebase

So I agree that this patch is correct.

Thanks

Phillip

> +	todo_list_write_total_nr(&new_todo);
> +	res = pick_commits(r, &new_todo, opts);
> +	todo_list_release(&new_todo);
> +
> +	return res;
>   }
>   
>   struct subject2item_entry {
>
Junio C Hamano Oct. 2, 2019, 2:38 a.m. UTC | #2
Alban Gruin <alban.gruin@gmail.com> writes:

> Currently, complete_action() calls sequencer_continue() to do the
> rebase.  Even though the former already has the todo list, the latter
> loads it from the disk and parses it.  Calling directly pick_commits()
> from complete_action() avoids this unnecessary round trip.
>
> Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
> ---
>  sequencer.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)

All the earlier steps in this series are necessary because the
inconsistencies caused by not doing things the code updated by the
earlier steps do (e.g. leaving number of commits recorded in
total_nr and done_nr stale) were masked by re-reading the info from
the on-disk file.  And this step exposes these inconsistencies
because the extra reading from the file no longer happens.

That is my reading of the series---did I get it right?

I wonder if that can be stated clearly in the log message of the
earlier steps.

For example, by reading 1/5 or 2/5 alone, it would be natural for
readers to wonder why the original code that did not update done_nr
correctly terminated after going through the todo list (you would
expect that when done reaches total you are finished, so if one of
these variables are not maintained correctly, your termination
condition may be borked), and then by knowing that the current code
more-or-less works correctly by not terminating too early or barfing
to say it ran out of things to do prematurely, the next thing
readers would wonder is if these patches introduce new bugs by
incrementing these variables twice (iow, "does the author of the
patch missed other places where these variables are correctly
updated?")

1/5 for example talks about the variable mostly being used by
prompts, which may be useful to reduce worries by readers about
correctness (iow, "well, if it is only showing wrong number in the
prompts and does not affect correctness otherwise, perhaps that was
why the current code that is buggy did not behave incorrectly").
But I suspect that it is not why these changes in the earlier steps
are acceptable or are right things to do.  So, perhaps replace the
"these are used only in prompts and the numbers being wrong does not
affect correctness" comments from them with:

    By forgetting to update the variable, the original code made it
    not reflect the reality, but this flaw was masked by the code
    calling (sometimes unnecessarily) read_todo_list() again to
    update the variable to its correct value before the next action
    happens.  At the end of this series, the unnecessary call will
    be removed and the inconsistency addressed by this patch would
    start to matter.

or something like that (assuming that the answer to the first
question I asked in this message is "yes", that is), or a more
concise version of it?

Thanks.
Johannes Schindelin Oct. 2, 2019, 8:20 a.m. UTC | #3
Hi,

On Fri, 27 Sep 2019, Phillip Wood wrote:

> Hi Alban
>
> Thanks for removing some more unnecessary work reloading the the todo list.
>
> On 25/09/2019 21:13, Alban Gruin wrote:
> > Currently, complete_action() calls sequencer_continue() to do the
> > rebase.  Even though the former already has the todo list, the latter
> > loads it from the disk and parses it.  Calling directly pick_commits()
> > from complete_action() avoids this unnecessary round trip.
> > Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
> > ---
> >   sequencer.c | 8 +++++---
> >   1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/sequencer.c b/sequencer.c
> > index ec7ea8d9e5..b395dd6e11 100644
> > --- a/sequencer.c
> > +++ b/sequencer.c
> > @@ -5140,15 +5140,17 @@ int complete_action(struct repository *r, struct
> > replay_opts *opts, unsigned fla
> >    	return error_errno(_("could not write '%s'"), todo_file);
> >    }
> >   -	todo_list_release(&new_todo);
> > -
> >    if (checkout_onto(r, opts, onto_name, &oid, orig_head))
> >     return -1;
> >
> >    if (require_clean_work_tree(r, "rebase", "", 1, 1))
> >     return -1;
> >   -	return sequencer_continue(r, opts);
>
> sequencer_continue does a number of things before calling pick_commits(). It
>  - calls read_and_refresh_cache() - this is unnecessary here as we've just
> called require_clean_work_tree()
>  - calls read_populate_opts() - this is unnecessary as we're staring a new
> rebase so opts is fully populated
>  - loads the todo list - this is unnecessary as we've just populated the todo
> list
>  - commits any staged changes - this is unnecessary as we're staring a new
> rebase so there are no staged changes
>  - calls record_in_rewritten() - this is unnecessary as we're starting a new
> rebase
>
> So I agree that this patch is correct.

All true. Could this careful analysis maybe be included in the commit
message (with `s/staring/starting/`)?

Thanks,
Dscho

>
> Thanks
>
> Phillip
>
> > +	todo_list_write_total_nr(&new_todo);
> > +	res = pick_commits(r, &new_todo, opts);
> > +	todo_list_release(&new_todo);
> > +
> > +	return res;
> >   }
> >
> >   struct subject2item_entry {
> >
>
Alban Gruin Oct. 2, 2019, 6:03 p.m. UTC | #4
Hi Johannes,

Le 02/10/2019 à 10:20, Johannes Schindelin a écrit :
> Hi,
> 
> On Fri, 27 Sep 2019, Phillip Wood wrote:
> 
>> Hi Alban
>>
>> Thanks for removing some more unnecessary work reloading the the todo list.
>>
>> On 25/09/2019 21:13, Alban Gruin wrote:
>>> Currently, complete_action() calls sequencer_continue() to do the
>>> rebase.  Even though the former already has the todo list, the latter
>>> loads it from the disk and parses it.  Calling directly pick_commits()
>>> from complete_action() avoids this unnecessary round trip.
>>> Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
>>> ---
>>>   sequencer.c | 8 +++++---
>>>   1 file changed, 5 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/sequencer.c b/sequencer.c
>>> index ec7ea8d9e5..b395dd6e11 100644
>>> --- a/sequencer.c
>>> +++ b/sequencer.c
>>> @@ -5140,15 +5140,17 @@ int complete_action(struct repository *r, struct
>>> replay_opts *opts, unsigned fla
>>>    	return error_errno(_("could not write '%s'"), todo_file);
>>>    }
>>>   -	todo_list_release(&new_todo);
>>> -
>>>    if (checkout_onto(r, opts, onto_name, &oid, orig_head))
>>>     return -1;
>>>
>>>    if (require_clean_work_tree(r, "rebase", "", 1, 1))
>>>     return -1;
>>>   -	return sequencer_continue(r, opts);
>>
>> sequencer_continue does a number of things before calling pick_commits(). It
>>  - calls read_and_refresh_cache() - this is unnecessary here as we've just
>> called require_clean_work_tree()
>>  - calls read_populate_opts() - this is unnecessary as we're staring a new
>> rebase so opts is fully populated
>>  - loads the todo list - this is unnecessary as we've just populated the todo
>> list
>>  - commits any staged changes - this is unnecessary as we're staring a new
>> rebase so there are no staged changes
>>  - calls record_in_rewritten() - this is unnecessary as we're starting a new
>> rebase
>>
>> So I agree that this patch is correct.
> 
> All true. Could this careful analysis maybe be included in the commit
> message (with `s/staring/starting/`)?
> 

I will do so (same for your comment on 4/5) and resend this series as
soon as possible.

Cheers,
Alban



> Thanks,
> Dscho
> 
>>
>> Thanks
>>
>> Phillip
>>
>>> +	todo_list_write_total_nr(&new_todo);
>>> +	res = pick_commits(r, &new_todo, opts);
>>> +	todo_list_release(&new_todo);
>>> +
>>> +	return res;
>>>   }
>>>
>>>   struct subject2item_entry {
>>>
>>
Alban Gruin Oct. 2, 2019, 6:37 p.m. UTC | #5
Hi Junio,

Le 02/10/2019 à 04:38, Junio C Hamano a écrit :
> Alban Gruin <alban.gruin@gmail.com> writes:
> 
>> Currently, complete_action() calls sequencer_continue() to do the
>> rebase.  Even though the former already has the todo list, the latter
>> loads it from the disk and parses it.  Calling directly pick_commits()
>> from complete_action() avoids this unnecessary round trip.
>>
>> Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
>> ---
>>  sequencer.c | 8 +++++---
>>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> All the earlier steps in this series are necessary because the
> inconsistencies caused by not doing things the code updated by the
> earlier steps do (e.g. leaving number of commits recorded in
> total_nr and done_nr stale) were masked by re-reading the info from
> the on-disk file.  And this step exposes these inconsistencies
> because the extra reading from the file no longer happens.
> 
> That is my reading of the series---did I get it right?
> 

Yes.

> I wonder if that can be stated clearly in the log message of the
> earlier steps.
> 
> For example, by reading 1/5 or 2/5 alone, it would be natural for
> readers to wonder why the original code that did not update done_nr
> correctly terminated after going through the todo list (you would
> expect that when done reaches total you are finished, so if one of
> these variables are not maintained correctly, your termination
> condition may be borked), and then by knowing that the current code
> more-or-less works correctly by not terminating too early or barfing
> to say it ran out of things to do prematurely, the next thing
> readers would wonder is if these patches introduce new bugs by
> incrementing these variables twice (iow, "does the author of the
> patch missed other places where these variables are correctly
> updated?")
> 
> 1/5 for example talks about the variable mostly being used by
> prompts, which may be useful to reduce worries by readers about
> correctness (iow, "well, if it is only showing wrong number in the
> prompts and does not affect correctness otherwise, perhaps that was
> why the current code that is buggy did not behave incorrectly").
> But I suspect that it is not why these changes in the earlier steps
> are acceptable or are right things to do.

Interestingly, it is the only reason for these two patches.  If you skip
them both, t34??-*.sh will not fail, only t9903-bash-prompt.sh will.
This is because `total_nr' does not reflect the real number of items in
a todo list, but the number of steps, done *and* remaining.  When a
rebase is resumed, some commands may already have been run, and are no
longer part of the todo list.  So, it’s not used to determine whether or
not the rebase is done.  `nr' is used for this purpose.  `done_nr' and
`total_nr' are just used for user interaction.

>  So, perhaps replace the
> "these are used only in prompts and the numbers being wrong does not
> affect correctness" comments from them with:
> 
>     By forgetting to update the variable, the original code made it
>     not reflect the reality, but this flaw was masked by the code
>     calling (sometimes unnecessarily) read_todo_list() again to
>     update the variable to its correct value before the next action
>     happens.  At the end of this series, the unnecessary call will
>     be removed and the inconsistency addressed by this patch would
>     start to matter.
> 
> or something like that (assuming that the answer to the first
> question I asked in this message is "yes", that is), or a more
> concise version of it?
> 

I will do so.

Cheers,
Alban



> Thanks.
>
René Scharfe Oct. 26, 2019, 7:47 a.m. UTC | #6
Am 25.09.19 um 22:13 schrieb Alban Gruin:
> Currently, complete_action() calls sequencer_continue() to do the
> rebase.  Even though the former already has the todo list, the latter
> loads it from the disk and parses it.  Calling directly pick_commits()
> from complete_action() avoids this unnecessary round trip.
>
> Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
> ---
>  sequencer.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/sequencer.c b/sequencer.c
> index ec7ea8d9e5..b395dd6e11 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -5140,15 +5140,17 @@ int complete_action(struct repository *r, struct replay_opts *opts, unsigned fla
>  		return error_errno(_("could not write '%s'"), todo_file);
>  	}
>
> -	todo_list_release(&new_todo);
> -

Moving this call down leaks new_todo on error...

>  	if (checkout_onto(r, opts, onto_name, &oid, orig_head))
>  		return -1;

... here ...

>
>  	if (require_clean_work_tree(r, "rebase", "", 1, 1))
>  		return -1;
... and here.  Perhaps set res to -1 and jump to the end?

>
> -	return sequencer_continue(r, opts);
> +	todo_list_write_total_nr(&new_todo);
> +	res = pick_commits(r, &new_todo, opts);
> +	todo_list_release(&new_todo);
> +
> +	return res;
>  }
>
>  struct subject2item_entry {
>
Alban Gruin Oct. 26, 2019, 11:27 a.m. UTC | #7
Hi René,

Le 26/10/2019 à 09:47, René Scharfe a écrit :
> Am 25.09.19 um 22:13 schrieb Alban Gruin:
>> Currently, complete_action() calls sequencer_continue() to do the
>> rebase.  Even though the former already has the todo list, the latter
>> loads it from the disk and parses it.  Calling directly pick_commits()
>> from complete_action() avoids this unnecessary round trip.
>>
>> Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
>> ---
>>  sequencer.c | 8 +++++---
>>  1 file changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/sequencer.c b/sequencer.c
>> index ec7ea8d9e5..b395dd6e11 100644
>> --- a/sequencer.c
>> +++ b/sequencer.c
>> @@ -5140,15 +5140,17 @@ int complete_action(struct repository *r, struct replay_opts *opts, unsigned fla
>>  		return error_errno(_("could not write '%s'"), todo_file);
>>  	}
>>
>> -	todo_list_release(&new_todo);
>> -
> 
> Moving this call down leaks new_todo on error...
> 
>>  	if (checkout_onto(r, opts, onto_name, &oid, orig_head))
>>  		return -1;
> 
> ... here ...
> 
>>
>>  	if (require_clean_work_tree(r, "rebase", "", 1, 1))
>>  		return -1;
> ... and here.  Perhaps set res to -1 and jump to the end?
> 
>>
>> -	return sequencer_continue(r, opts);
>> +	todo_list_write_total_nr(&new_todo);
>> +	res = pick_commits(r, &new_todo, opts);
>> +	todo_list_release(&new_todo);
>> +
>> +	return res;
>>  }
>>
>>  struct subject2item_entry {
>>
> 

Thank you for the heads up.

Junio, how do you want me to fix that?  Should I reroll the series
altogether, send a "fixup!" commit, or send a standalone patch?

Cheers,
Alban
Junio C Hamano Oct. 28, 2019, 1:39 a.m. UTC | #8
Alban Gruin <alban.gruin@gmail.com> writes:

> Junio, how do you want me to fix that?  Should I reroll the series
> altogether, send a "fixup!" commit, or send a standalone patch?

Normally a topic that is not yet in 'next' or more stable tracks can
and should be rerolled (or in a rare case like changing just a
single typo on a line in the last patch in a series, can be
corrected with a "fixup!" squashed in); once the topic is in 'next'
or more stable integration branch, there needs a normal freestanding
patch that may say "earlier we did X, which is broken for such and
such reasons.  correct it this way."

But during a pre-release feature freeze period, the rules can be a
bit different ;-) Typically a topic that is not in 'master' that is
not a bugfix will never graduate from 'next' until the release
happens, and after the release, the tip of 'next' gets rebuilt from
the newly cut release, at which point a topic that wants a fresh
start (in order to avoid "oops, this was wrong, and here is a fix"
follow-up patch) can be granted one.

So I'd probably just send a "fixup!"to be queued on top of the
series to fix a leak in 'next' for now, remind the maintainer not to
merge it to 'master' before the release, and once the upcoming
release is made, send another reminder to the maintainer to squash
the "fixup!" in before rebuilding 'next', if I owned this series.

Thanks.
Junio C Hamano Oct. 28, 2019, 3:20 a.m. UTC | #9
Junio C Hamano <gitster@pobox.com> writes:

> So I'd probably just send a "fixup!"to be queued on top of the
> series to fix a leak in 'next' for now, remind the maintainer not to
> merge it to 'master' before the release, and once the upcoming
> release is made, send another reminder to the maintainer to squash
> the "fixup!" in before rebuilding 'next', if I owned this series.

This is based on René's suggestion, but seeing the pre-context of
this fix, many of the error code path, after an error return from
edit_todo_list() has been handled, can follow the pattern of this
fix to jump to the cleanup label after losing their own calls to
todo_list_release().  An obvious advantage of doing so is that it
future-proofs the codepath---the todo-list structure may not stay to
be the only thing that holds resources that need cleaned up.

 sequencer.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index b395dd6e11..ec0b793fc5 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -5140,14 +5140,18 @@ int complete_action(struct repository *r, struct replay_opts *opts, unsigned fla
 		return error_errno(_("could not write '%s'"), todo_file);
 	}
 
+	res = -1;
+
 	if (checkout_onto(r, opts, onto_name, &oid, orig_head))
-		return -1;
+		goto cleanup;
 
 	if (require_clean_work_tree(r, "rebase", "", 1, 1))
-		return -1;
+		goto cleanup;
 
 	todo_list_write_total_nr(&new_todo);
 	res = pick_commits(r, &new_todo, opts);
+
+cleanup:
 	todo_list_release(&new_todo);
 
 	return res;

Patch
diff mbox series

diff --git a/sequencer.c b/sequencer.c
index ec7ea8d9e5..b395dd6e11 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -5140,15 +5140,17 @@  int complete_action(struct repository *r, struct replay_opts *opts, unsigned fla
 		return error_errno(_("could not write '%s'"), todo_file);
 	}
 
-	todo_list_release(&new_todo);
-
 	if (checkout_onto(r, opts, onto_name, &oid, orig_head))
 		return -1;
 
 	if (require_clean_work_tree(r, "rebase", "", 1, 1))
 		return -1;
 
-	return sequencer_continue(r, opts);
+	todo_list_write_total_nr(&new_todo);
+	res = pick_commits(r, &new_todo, opts);
+	todo_list_release(&new_todo);
+
+	return res;
 }
 
 struct subject2item_entry {